Update README.md
README.md
CHANGED
@@ -28,12 +28,11 @@ The model is trained for four epochs on the [crumb/flan-ul2-tinystories](https:/
 Training arguments:
 
 ```
-per_device_train_batch_size=
-gradient_accumulation_steps=
+per_device_train_batch_size=16,
+gradient_accumulation_steps=8,
 warmup_steps=128,
 num_train_epochs=4,
 learning_rate=2e-4,
-bf16=True,
 eval_steps=64,
 optim="adamw_torch",
 ```
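
For readers checking what the updated values imply, here is a minimal sketch that collects the new-side arguments from this change into a plain dictionary and computes the number of sequences per optimizer step. The helper `effective_batch_size` and the `num_devices` parameter are hypothetical names introduced for illustration. The README does not state how many devices were used, so the default of one device is an assumption.

```python
# New-side values copied from this change; the dict itself is illustrative.
train_args = {
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 8,
    "warmup_steps": 128,
    "num_train_epochs": 4,
    "learning_rate": 2e-4,
    "eval_steps": 64,
    "optim": "adamw_torch",
}


def effective_batch_size(args: dict, num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer step.

    num_devices is an assumption: the README does not give a device count.
    """
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


if __name__ == "__main__":
    # 16 sequences per device * 8 accumulation steps * 1 device
    print(effective_batch_size(train_args))
```

Under the single-device assumption, each optimizer step sees 16 × 8 = 128 sequences; the figure scales linearly with the device count.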