Update README.md
README.md CHANGED
@@ -73,7 +73,7 @@ The training data was pre-processed with a `PreTrainedTokenizerFast()` trained o
 - **Gradient accumulation steps:** 4
 - **Mixed precision:** fp16, native amp
 - **Learning rate:** 0.0025
-- **
+- **Learning rate scheduler:** Cosine
 - **Learning rate scheduler warmup:** 0.1
 - **Optimizer:** AdamW with betas=(0.9,0.95) and epsilon=1e-08
 - **Number of epochs:** 50
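
For reference, the hyperparameters listed in the updated hunk map onto a `transformers.TrainingArguments` configuration roughly as sketched below. This is only an illustrative sketch: the `output_dir` value and the `optim="adamw_torch"` choice are assumptions, not taken from the README.

```python
from transformers import TrainingArguments

# Sketch of the README's training hyperparameters; output_dir and the
# adamw_torch optimizer implementation are assumed for illustration.
training_args = TrainingArguments(
    output_dir="./output",          # assumed path, not from the README
    gradient_accumulation_steps=4,  # Gradient accumulation steps: 4
    fp16=True,                      # Mixed precision: fp16, native amp
    learning_rate=0.0025,           # Learning rate: 0.0025
    lr_scheduler_type="cosine",     # Learning rate scheduler: Cosine
    warmup_ratio=0.1,               # Learning rate scheduler warmup: 0.1
    optim="adamw_torch",            # Optimizer: AdamW (implementation assumed)
    adam_beta1=0.9,                 # betas=(0.9, 0.95)
    adam_beta2=0.95,
    adam_epsilon=1e-08,             # epsilon=1e-08
    num_train_epochs=50,            # Number of epochs: 50
)
```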