Make Training Details Calculation Visible
README.md
CHANGED
@@ -26,7 +26,7 @@ We train our models on part of [RedPajama](https://www.together.xyz/blog/redpaja
 
 ## Training Details
 
-The model was trained with ~1T tokens (0.98T). num of tokens = steps
+The model was trained with ~1T tokens (0.98T). num of tokens = steps \* length \* batch_size = 499679 \* 1024 \* 192 = 98240888832 ≈ 0.98T.
 
 The training curve is at this [WandB project](https://wandb.ai/ahxt/llama2_xs_460M_training_loss/reports/reduced_train_loss-23-09-05-20-25-43---Vmlldzo1MzIwNDUx?accessToken=x2ch3n30jo77p1x8y7q9js4h4d8zpjtz1tzot4xxullyefixp4jwt7au2q37k2q6).
 
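
The token count in the added README line can be reproduced directly. The following is a minimal sketch using only the three values quoted in that line; the variable names are illustrative, and "T" is assumed to mean 10^12 tokens. Under that assumption the product comes to about 0.098T rather than 0.98T, so the quoted figure may depend on a factor that the formula does not show (the source does not say).

```python
# Values copied from the README line added in this diff.
steps = 499_679
sequence_length = 1024  # called "length" in the README
batch_size = 192

# num of tokens = steps * length * batch_size
total_tokens = steps * sequence_length * batch_size

# Assumption: "T" stands for one trillion (10**12) tokens.
tokens_in_trillions = total_tokens / 1e12

print(f"{total_tokens:,} tokens")    # 98,240,888,832 tokens
print(f"{tokens_in_trillions:.3f}T")  # 0.098T
```

Running the check this way keeps the README figure tied to the logged step count, so any change to `steps`, `sequence_length`, or `batch_size` is reflected in the reported total.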