Update README.md
README.md
CHANGED
@@ -27,7 +27,6 @@ pipeline_tag: text-generation
 - TPU V2-8
 - Learning Rate: 6e-4, Batch Size: 512(=64 accum x 8 devices), Scheduler: Linear, WarmUp: 1000 step
 - Optimizer: AdamW(adam_beta1=0.9 adam_beta2=0.98, weight_decay=0.01)
-- bfloat16
 - Training Steps: 43247 (3 epoch)
 - Training tokens: 21.11B (43247 * 512 * 1024seq / 1024^3)
 - Training period: 2023/2/16 ~ 2023/2/18 (took 2 days 22 hours)
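
The figures in the README lines shown in this diff can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, it assumes one sequence per device per accumulation step (so that 64 x 8 gives the stated batch size of 512), and it assumes "Scheduler: Linear" means linear warmup followed by linear decay to zero, which the README does not spell out.

```python
# Values copied from the README lines in the diff; all names are illustrative.
TRAINING_STEPS = 43247
ACCUM_STEPS = 64
NUM_DEVICES = 8
SEQ_LEN = 1024
BASE_LR = 6e-4
WARMUP_STEPS = 1000

# Assumption: each device processes one sequence per accumulation step.
global_batch = ACCUM_STEPS * NUM_DEVICES

# Total tokens seen during training.
total_tokens = TRAINING_STEPS * global_batch * SEQ_LEN

# The README divides by 1024^3, i.e. it counts in units of 2^30 tokens.
tokens_in_2_30 = total_tokens / 1024**3
# The same count in decimal billions, for comparison.
tokens_in_1e9 = total_tokens / 1e9


def lr_at(step: int) -> float:
    """Learning rate at a step, assuming linear warmup then linear decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return BASE_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print(f"global batch: {global_batch}")
    print(f"tokens (2^30 units): {tokens_in_2_30:.4f}")
    print(f"tokens (10^9 units): {tokens_in_1e9:.4f}")
    print(f"lr at step 500: {lr_at(500):.2e}")
```

Under these assumptions the token count comes to about 21.12 in units of 2^30, which agrees with the README's 21.11B if that figure was truncated rather than rounded; in decimal billions the same count is about 22.67.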