Saving weights and log at step 760000

Files changed (6)

README.md CHANGED

@@ -30,7 +30,7 @@ Tokenizer:
 Training details:
 * Trained for 70K steps (batch size 64) to ppl 27 on mc4 nl tiny 1 epoch
-* Trained for 620K steps (batch size 16) to ppl 17.5 on mc4 nl full
+* Trained for 760K steps (batch size 16) to ppl 16.8 on mc4 nl full
 * Training continuing
 * Block size: 512
 * Optimizer: adafactor

flax_model.msgpack CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:afec4b5e767f28df03f1d9c2ac05c078d2ef04177c91043bc21016f626ab4057
+oid sha256:9219656705501e15f9f93b78df01c8a339552af0685161f55febd8dc2edca3fc
 size 5262314590

opt_state.msgpack ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:9415497a6e41b76b0baa60a31beb021fe2a13f11f513106d350c89beda73f7f6
+size 5778100

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:482129627edcc72671fcb0c8202d1fac13ece807b6487ce4d3e36dfd1123448d
+oid sha256:71eac87d5c3e71477204c4b97e36e0beb17c43686afc160ee20955316ab50c80
 size 5363100545

runs/events.out.tfevents.1641156371.t1v-n-2f64d7c8-w-0.13342.0.v2 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:29639ddd64e571d7e5e9a924fbf2c16b5c5b228f07e631b0903c8b95319f7b54
-size 93197623
+oid sha256:909e7ac40e6afe9723bd239188da21469a915ed6c44b30318bdca1e48dd9ba04
+size 114081255

training_state.json ADDED

@@ -0,0 +1 @@
+{"step": 760001}
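
The large files in this commit are stored as Git LFS pointers, each recording a `version`, an `oid` (hash algorithm and digest), and a `size` in bytes. The sketch below is one possible way to check a downloaded file against such a pointer. It is not part of this repository: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the example assumes the pointer text has exactly the three fields shown in the diff above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse Git LFS pointer text into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file's size and digest match the pointer (hypothetical helper)."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text copied from the opt_state.msgpack entry in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9415497a6e41b76b0baa60a31beb021fe2a13f11f513106d350c89beda73f7f6\n"
    "size 5778100\n"
)
```

Assuming a local copy of the file exists, `matches_pointer("opt_state.msgpack", pointer)` would report whether it is the version recorded at this commit. Comparing the size before hashing is a design choice that lets truncated downloads fail quickly.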