TimeRobber
committed on
Commit · 488b707
1 Parent(s): b954a57
Update README.md
README.md CHANGED
@@ -221,19 +221,16 @@ It was pretrained on mC4 and then finetuned on xP3, P3 or xP3mt.
 ## Speeds, Sizes, Times
 
 // TODO @adarob: Maybe we can push tensorboard on this repo as well
-Training logs:
 
--
-
-
+- Training logs:
+
+- Checkpoint size: 51.7GB (Bf16 weights)
 
 - Number of epochs: 1
 
 
 ## Environmental Impact
 
-// TODO @adarob: Is it possible for you to share some information about the impact of where you trained it?
-
 The evaluation supercomputer, [Jean Zay](http://www.idris.fr/eng/jean-zay/), uses mostly nuclear energy. The heat generated by it is reused for heating campus housing.
 
 </details>