Update README.md
README.md
CHANGED
@@ -39,17 +39,36 @@ Meta developed and publicly released the Llama 2 family of large language models
 
 ### Training hyperparameters ⚙
 
-
+```py
+optim="paged_adamw_32bit",
+num_train_epochs = 2,
+eval_steps=50,
+save_steps=50,
+evaluation_strategy="steps",
+save_strategy="steps",
+save_total_limit=2,
+seed=66,
+load_best_model_at_end=True,
+logging_steps=1,
+learning_rate=2e-4,
+fp16=True,
+bf16=False,
+max_grad_norm=0.3,
+warmup_ratio=0.03,
+group_by_length=True,
+lr_scheduler_type="constant"
+```
 
 ### Training results 🗒️
 
+
 | Step | Training Loss | Validation Loss |
-
-
-
-
-
-
+|------|----------|----------|
+| 50 | 0.624400 | 0.600070 |
+| 100 | 0.634100 | 0.592757 |
+| 150 | 0.545800 | 0.586652 |
+| 200 | 0.572500 | 0.577525 |
+| 250 | 0.528000 | 0.590118 |
 
 
 
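Reviewer note: the added results table can be read against `load_best_model_at_end=True` with a small sketch like the one below. It assumes the best checkpoint is chosen by lowest validation loss; the diff does not set `metric_for_best_model`, so that criterion is an assumption here, not something the commit states. The names `RESULTS` and `best_checkpoint` are hypothetical helpers for illustration, and the rows are copied from the table in this diff.

```python
# Rows copied from the "Training results" table in the diff:
# (step, training loss, validation loss).
RESULTS = [
    (50, 0.624400, 0.600070),
    (100, 0.634100, 0.592757),
    (150, 0.545800, 0.586652),
    (200, 0.572500, 0.577525),
    (250, 0.528000, 0.590118),
]


def best_checkpoint(rows):
    """Return the (step, train_loss, val_loss) row with the lowest validation loss.

    Assumption: "best" means lowest validation loss; the diff itself does not
    say which metric is used to pick the best model.
    """
    return min(rows, key=lambda row: row[2])


best_step, _, best_val = best_checkpoint(RESULTS)
print(f"lowest validation loss {best_val:.6f} at step {best_step}")
```

Under that assumption, step 200 is the evaluation point that would be restored, since validation loss rises again at step 250 even though training loss keeps falling.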