Update README.md
README.md (changed)

@@ -39,17 +39,36 @@ Meta developed and publicly released the Llama 2 family of large language models

### Training hyperparameters ⚙

```py
optim="paged_adamw_32bit",      # paged 32-bit AdamW optimizer (bitsandbytes)
num_train_epochs=2,             # train for 2 epochs
eval_steps=50,                  # run evaluation every 50 steps
save_steps=50,                  # save a checkpoint every 50 steps
evaluation_strategy="steps",    # evaluate by step count rather than per epoch
save_strategy="steps",          # checkpoint by step count rather than per epoch
save_total_limit=2,             # keep at most 2 checkpoints on disk
seed=66,                        # random seed for reproducibility
load_best_model_at_end=True,    # reload the best checkpoint when training ends
logging_steps=1,                # log metrics every step
learning_rate=2e-4,
fp16=True,                      # fp16 mixed-precision training
bf16=False,
max_grad_norm=0.3,              # clip gradient norm to 0.3
warmup_ratio=0.03,              # fraction of steps used for learning-rate warmup
group_by_length=True,           # batch samples of similar length to reduce padding
lr_scheduler_type="constant"    # constant learning-rate schedule
```
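These settings map one-to-one onto keyword arguments of Hugging Face `transformers.TrainingArguments`. As a rough illustration, they could be assembled as below; this is a sketch, not the exact training script, and `output_dir` is a placeholder that is not part of the listing above.

```py
# Sketch only: wiring the hyperparameters above into TrainingArguments.
# output_dir is a placeholder, not taken from this repo.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    optim="paged_adamw_32bit",
    num_train_epochs=2,
    eval_steps=50,
    save_steps=50,
    evaluation_strategy="steps",
    save_strategy="steps",
    save_total_limit=2,
    seed=66,
    load_best_model_at_end=True,
    logging_steps=1,
    learning_rate=2e-4,
    fp16=True,
    bf16=False,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
```

With `evaluation_strategy="steps"` and `eval_steps=50`, validation runs every 50 steps, which is what produces the loss table below.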

### Training results 🗒️

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 50   | 0.624400      | 0.600070        |
| 100  | 0.634100      | 0.592757        |
| 150  | 0.545800      | 0.586652        |
| 200  | 0.572500      | 0.577525        |
| 250  | 0.528000      | 0.590118        |
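For reference, with `load_best_model_at_end=True` and the default best-model metric (evaluation loss), the step-200 checkpoint (validation loss 0.577525) is the one kept as the best model. A small sketch of that selection, using the numbers from the table:

```py
# Sketch: select the best checkpoint from the table above by validation loss,
# mirroring what load_best_model_at_end does with its default "loss" metric.
val_loss = {50: 0.600070, 100: 0.592757, 150: 0.586652, 200: 0.577525, 250: 0.590118}
best_step = min(val_loss, key=val_loss.get)
print(best_step, val_loss[best_step])  # -> 200 0.577525
```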