End of training

Files changed (7)

README.md +33 -13
config.json +1 -1
logs/events.out.tfevents.1714285176.ip-10-25-205-144.266412.1 +3 -0
logs/events.out.tfevents.1714287609.ip-10-25-205-144.266412.2 +3 -0
logs/events.out.tfevents.1714289724.ip-10-25-205-144.266412.3 +3 -0
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -1,25 +1,19 @@
 ---
-base_model: google/byt5-small
 tags:
 - generated_from_trainer
 model-index:
-- name: byt5_add_1k
  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# byt5_add_1k
-This model is a fine-tuned version of [AlexWang99/byt5_add_1k/checkpoint-86](https://huggingface.co/AlexWang99/byt5_add_1k/checkpoint-86) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: 0.4665
-- eval_runtime: 10.7864
-- eval_samples_per_second: 927.092
-- eval_steps_per_second: 1.205
-- epoch: 32.0
-- step: 64
 ## Model description
@@ -38,13 +32,39 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 800
 - eval_batch_size: 800
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 80
 ### Framework versions

 ---
 tags:
 - generated_from_trainer
 model-index:
+- name: byt5_1k
  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# byt5_1k
+This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0868
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 400
 - eval_batch_size: 800
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 20
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| No log | 1.0 | 3 | 0.1081 |
+| No log | 2.0 | 6 | 0.0983 |
+| No log | 3.0 | 9 | 0.1285 |
+| 0.1432 | 4.0 | 12 | 0.0961 |
+| 0.1432 | 5.0 | 15 | 0.1040 |
+| 0.1432 | 6.0 | 18 | 0.1032 |
+| 0.1488 | 7.0 | 21 | 0.0938 |
+| 0.1488 | 8.0 | 24 | 0.0979 |
+| 0.1488 | 9.0 | 27 | 0.0976 |
+| 0.1375 | 10.0 | 30 | 0.0885 |
+| 0.1375 | 11.0 | 33 | 0.0907 |
+| 0.1375 | 12.0 | 36 | 0.0863 |
+| 0.1375 | 13.0 | 39 | 0.0843 |
+| 0.1297 | 14.0 | 42 | 0.0833 |
+| 0.1297 | 15.0 | 45 | 0.0840 |
+| 0.1297 | 16.0 | 48 | 0.0861 |
+| 0.1241 | 17.0 | 51 | 0.0903 |
+| 0.1241 | 18.0 | 54 | 0.0891 |
+| 0.1241 | 19.0 | 57 | 0.0876 |
+| 0.1185 | 20.0 | 60 | 0.0868 |
 ### Framework versions
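
The reported `Loss: 0.0868` in the new card matches the last row of the training results table (epoch 20), not the lowest value in the table. The sketch below copies the Validation Loss column by hand and locates the minimum. The names `val_loss`, `STEPS_PER_EPOCH`, and `best_epoch` are illustrative and do not come from the repository. Whether the run restored an earlier checkpoint (for example via the Trainer's `load_best_model_at_end` option) is not visible in this diff, so treat this as a reading aid rather than a statement about which weights were pushed.

```python
# Validation loss per epoch, copied by hand from the "Training results"
# table in the new README.md (epochs 1 through 20).
val_loss = [
    0.1081, 0.0983, 0.1285, 0.0961, 0.1040,
    0.1032, 0.0938, 0.0979, 0.0976, 0.0885,
    0.0907, 0.0863, 0.0843, 0.0833, 0.0840,
    0.0861, 0.0903, 0.0891, 0.0876, 0.0868,
]

# The Step column advances by 3 per epoch (3, 6, ..., 60).
STEPS_PER_EPOCH = 3


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-based."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


epoch, loss = best_epoch(val_loss)
final = val_loss[-1]

print(f"best:  epoch {epoch}, step {epoch * STEPS_PER_EPOCH}, loss {loss:.4f}")
print(f"final: epoch {len(val_loss)}, step {len(val_loss) * STEPS_PER_EPOCH}, "
      f"loss {final:.4f} (+{final - loss:.4f} vs best)")
```

By this table the minimum is at epoch 14 (step 42) with a validation loss of 0.0833, and the loss drifts up slightly afterwards before settling at 0.0868.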

config.json CHANGED

@@ -1,5 +1,5 @@
 {
- "_name_or_path": "google/byt5-small",
  "architectures": [
  "T5ForConditionalGeneration"
  ],

 {
+ "_name_or_path": "AlexWang99/byt5_1k",
  "architectures": [
  "T5ForConditionalGeneration"
  ],

logs/events.out.tfevents.1714285176.ip-10-25-205-144.266412.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b21af001fe5158c276dc24fdcff6d1e0b1dc82469343c4470940e188733512fb
+size 25835

logs/events.out.tfevents.1714287609.ip-10-25-205-144.266412.2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1b94955ee84f5953be16af7990569559175738ccf0ab05b30e1681501329ca62
+size 20569

logs/events.out.tfevents.1714289724.ip-10-25-205-144.266412.3 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1ef606384881afd7ee6220395ef6dd0dab0b554a58bbf2b2d68dcd37ab52ad1c
+size 11148

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b1acc4b999f06bdef9036c06dcd7b3f67519c24515a6ec23dcfcfc5abe6feb43
 size 1198571496

 version https://git-lfs.github.com/spec/v1
+oid sha256:b1d38312142676df106c129e100839b6de37257ace7652c5df8d40b1aa17cbdb
 size 1198571496

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:160d58158a6ec3f8a009c04569665c565cfda06315a5577cb2100ca9803c9bab
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:cab6b39dc088aa20d7417697114e95ffad3ae5054465ee82549b2fc0dc90a37d
 size 4792
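
The binary files in this commit appear only as Git LFS pointers: a `version` line, an `oid` (hash algorithm plus hex digest), and a `size` in bytes. Note that `model.safetensors` and `training_args.bin` keep the same size across the change while the `oid` differs, so only the hash shows that the contents changed. The following is a minimal sketch of checking a downloaded file against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the simple three-line pointer layout shown in this diff.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,          # e.g. "sha256"
        "oid": digest,         # hex digest of the real file
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    digest = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so large files such as model weights are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


# New pointer for training_args.bin, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cab6b39dc088aa20d7417697114e95ffad3ae5054465ee82549b2fc0dc90a37d
size 4792
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Given a local copy of the file, `matches_pointer("training_args.bin", pointer)` would report whether it corresponds to this commit; the path here is only an example.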