Saving weights at step 300000, update README
README.md
CHANGED
@@ -19,6 +19,7 @@ datasets:
 Dataset:

 * [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned)
+* dataset config: tiny (3B tokens)
 * dataset config: full (33B tokens)

 Tokenizer:
@@ -28,14 +29,15 @@ Tokenizer:

 Training details:

-* Trained for
+* Trained for 70K steps (batch size 64) to ppl 27 on mc4 nl tiny 1 epoch
+* Trained for 300K steps (batch size 16) to ppl 20.4 on mc4 nl full
+* Training continuing
 * Block size: 512
 * Optimizer: adafactor
 * lr: 5e-5
-* Batch size: 64
 * Warmup steps: 5000

-Work in progress.
+Work in progress. Jan 2022

 * Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
 * Thanks to @gsarti for creating the [t5-flax-gcp
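The training details in the README change are enough for a rough estimate of how many tokens each run has processed. The sketch below assumes every step consumes exactly batch size × block size tokens (no padding and no gradient accumulation, neither of which the README states) and that the reported perplexity is the exponential of a natural-log cross-entropy loss. The function names are illustrative only and do not come from the training code.

```python
import math

BLOCK_SIZE = 512  # "Block size: 512" in the README


def tokens_seen(steps: int, batch_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Tokens processed, assuming each step sees batch_size full blocks."""
    return steps * batch_size * block_size


def loss_from_perplexity(ppl: float) -> float:
    """Mean per-token loss in nats, assuming ppl = exp(loss)."""
    return math.log(ppl)


# Figures taken from the two "Trained for ..." lines in the README.
tiny_tokens = tokens_seen(steps=70_000, batch_size=64)
full_tokens = tokens_seen(steps=300_000, batch_size=16)

print(f"tiny run: {tiny_tokens / 1e9:.2f}B tokens, loss ~{loss_from_perplexity(27):.2f}")
print(f"full run: {full_tokens / 1e9:.2f}B tokens, loss ~{loss_from_perplexity(20.4):.2f}")
```

Under these assumptions the 300K-step run has processed about 2.46B tokens, a small fraction of the 33B tokens in the full config, which is consistent with the "Training continuing" note.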
flax_model.msgpack
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b4b4e9bbfc621b6f5f40d464b9a457e710c116168f3175beec1363fd2a531006
 size 5262314590
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0ab4fce27a9dee2817a712b540727128d366cc77783f654308348bbfafd61d0c
 size 5363100545
runs/events.out.tfevents.1641156371.t1v-n-2f64d7c8-w-0.13342.0.v2
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:76f565900752295f8ea814d3b2d748344955071342c221463fbf4b7f6e21a37c
+size 46284021
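The weight and event files in this commit are Git LFS pointers: the repository stores only the SHA-256 `oid` and byte `size` of the real file. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple `key value` line layout shown in the hunks above.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at path has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so multi-gigabyte weight files are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["digest"]


# Pointer contents copied from the new side of the events-file hunk.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:76f565900752295f8ea814d3b2d748344955071342c221463fbf4b7f6e21a37c
size 46284021
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the events file, `matches_pointer(local_path, pointer)` would report whether the download is intact. The size comparison runs first because it is far cheaper than hashing.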