Commit 1b49982 (committed by teddy-f-47)
Parent(s): 12482b0
Update README.md
README.md CHANGED
@@ -41,7 +41,7 @@ The 20231201 Polish Wikipedia dump.
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- distributed_type: multi-GPU
+- distributed_type: multi-GPU (DDP)
 - num_devices: 4
 - train_batch_size: 2
 - gradient_accumulation_steps: 8
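
Reviewer note: the change only names the multi-GPU strategy (DDP), but that makes it easier to see how the listed values combine. The sketch below is a rough illustration, not part of the commit. It assumes `train_batch_size` is the per-device batch size and that each optimizer step aggregates gradients from all devices and all accumulation steps, which is the usual data-parallel convention. The helper name `effective_batch_size` is hypothetical.

```python
# Values copied from the hyperparameter list in the diff above.
num_devices = 4
train_batch_size = 2  # assumption: this is the per-device batch size
gradient_accumulation_steps = 8


def effective_batch_size(per_device: int, devices: int, accumulation_steps: int) -> int:
    """Number of samples contributing to one optimizer step under data parallelism."""
    return per_device * devices * accumulation_steps


total = effective_batch_size(train_batch_size, num_devices, gradient_accumulation_steps)
print(total)
```

Under these assumptions, each optimizer step sees 2 × 4 × 8 = 64 samples.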