yhavinga committed on
Commit b7f644f · 1 Parent(s): 9a77f4b

Saving weights at step 300000, update README

README.md CHANGED
@@ -19,6 +19,7 @@ datasets:
  Dataset:

  * [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned)
+ * dataset config: tiny (3B tokens)
  * dataset config: full (33B tokens)

  Tokenizer:
@@ -28,14 +29,15 @@ Tokenizer:

  Training details:

- * Trained for ? (1 jan 2022)
+ * Trained for 70K steps (batch size 64) to ppl 27 on mc4 nl tiny 1 epoch
+ * Trained for 300K steps (batch size 16) to ppl 20.4 on mc4 nl full
+ * Training continuing
  * Block size: 512
  * Optimizer: adafactor
  * lr: 5e-5
- * Batch size: 64
  * Warmup steps: 5000

- Work in progress. Jan2022
+ Work in progress. Jan 2022

  * Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
  * Thanks to @gsarti for creating the [t5-flax-gcp
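Note on the training details in the README diff: the step counts, batch sizes and block size give a rough upper bound on how many tokens each run has processed. The sketch below is only an estimate. It assumes that "batch size" counts sequences per optimizer step, that every sequence fills the full 512-token block, and that the reported perplexity is the exponential of a mean cross-entropy loss in nats. The README states none of these, and the helper names `tokens_seen` and `loss_from_ppl` are made up for illustration.

```python
import math

BLOCK_SIZE = 512  # "Block size: 512" from the README


def tokens_seen(steps: int, batch_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Upper-bound token count, assuming every sequence fills the block."""
    return steps * batch_size * block_size


def loss_from_ppl(ppl: float) -> float:
    """Cross-entropy in nats, assuming ppl = exp(mean loss)."""
    return math.log(ppl)


# Figures taken from the two "Trained for ..." lines in the README diff.
tiny_run = tokens_seen(steps=70_000, batch_size=64)
full_run = tokens_seen(steps=300_000, batch_size=16)

# Share of the "full (33B tokens)" config covered by the 300K-step run.
full_fraction = full_run / 33e9

print(f"tiny run: {tiny_run / 1e9:.2f}B tokens")
print(f"full run: {full_run / 1e9:.2f}B tokens ({full_fraction:.1%} of 33B)")
print(f"loss at ppl 27:   {loss_from_ppl(27):.3f}")
print(f"loss at ppl 20.4: {loss_from_ppl(20.4):.3f}")
```

Under these assumptions the two runs have processed a similar number of tokens, and the 300K-step run has covered well under one epoch of the full config, which fits the "Training continuing" line.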
flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:47dea64ce99676b0c506b73e3f9f0fd217d0150efcdae4f2074145f026f1bf28
+ oid sha256:b4b4e9bbfc621b6f5f40d464b9a457e710c116168f3175beec1363fd2a531006
  size 5262314590
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:411902806d7c0eac5e0a103f3548f689283e222df7b13e9144bbbf2b53b9868d
+ oid sha256:0ab4fce27a9dee2817a712b540727128d366cc77783f654308348bbfafd61d0c
  size 5363100545
runs/events.out.tfevents.1641156371.t1v-n-2f64d7c8-w-0.13342.0.v2 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:29250dcf0765c89d9572867e8e482f06264ec823cd2ba8b6a51bf326096a1bff
- size 46209307
+ oid sha256:76f565900752295f8ea814d3b2d748344955071342c221463fbf4b7f6e21a37c
+ size 46284021
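Note on the binary files: the three changed files are Git LFS pointers, so each diff shows only a `version` line, a SHA-256 `oid` and a byte `size`, not the weights or logs themselves. A minimal sketch of reading such pointers and comparing two revisions follows. `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling, and it assumes the simple `key value` layout seen in this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers for the TensorBoard events file in this commit.
old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:29250dcf0765c89d9572867e8e482f06264ec823cd2ba8b6a51bf326096a1bff
size 46209307
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:76f565900752295f8ea814d3b2d748344955071342c221463fbf4b7f6e21a37c
size 46284021
""")

content_changed = old["digest"] != new["digest"]
growth = new["size"] - old["size"]
print(f"content changed: {content_changed}, size delta: {growth} bytes")
```

The same comparison applied to `flax_model.msgpack` and `pytorch_model.bin` would show a changed digest with an unchanged size, which is what one would expect when weights are overwritten by a new checkpoint of the same architecture.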