AlexWang99 committed on
Commit
afb302f
1 Parent(s): c974aac

End of training

README.md CHANGED
@@ -1,25 +1,19 @@
  ---
- base_model: google/byt5-small
  tags:
  - generated_from_trainer
  model-index:
- - name: byt5_add_1k
+ - name: byt5_1k
  results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # byt5_add_1k
+ # byt5_1k
 
- This model is a fine-tuned version of [AlexWang99/byt5_add_1k/checkpoint-86](https://huggingface.co/AlexWang99/byt5_add_1k/checkpoint-86) on an unknown dataset.
+ This model was trained from scratch on an unknown dataset.
  It achieves the following results on the evaluation set:
- - eval_loss: 0.4665
- - eval_runtime: 10.7864
- - eval_samples_per_second: 927.092
- - eval_steps_per_second: 1.205
- - epoch: 32.0
- - step: 64
+ - Loss: 0.0868
 
  ## Model description
 
@@ -38,13 +32,39 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 800
+ - learning_rate: 0.0001
+ - train_batch_size: 400
  - eval_batch_size: 800
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 80
+ - num_epochs: 20
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log | 1.0 | 3 | 0.1081 |
+ | No log | 2.0 | 6 | 0.0983 |
+ | No log | 3.0 | 9 | 0.1285 |
+ | 0.1432 | 4.0 | 12 | 0.0961 |
+ | 0.1432 | 5.0 | 15 | 0.1040 |
+ | 0.1432 | 6.0 | 18 | 0.1032 |
+ | 0.1488 | 7.0 | 21 | 0.0938 |
+ | 0.1488 | 8.0 | 24 | 0.0979 |
+ | 0.1488 | 9.0 | 27 | 0.0976 |
+ | 0.1375 | 10.0 | 30 | 0.0885 |
+ | 0.1375 | 11.0 | 33 | 0.0907 |
+ | 0.1375 | 12.0 | 36 | 0.0863 |
+ | 0.1375 | 13.0 | 39 | 0.0843 |
+ | 0.1297 | 14.0 | 42 | 0.0833 |
+ | 0.1297 | 15.0 | 45 | 0.0840 |
+ | 0.1297 | 16.0 | 48 | 0.0861 |
+ | 0.1241 | 17.0 | 51 | 0.0903 |
+ | 0.1241 | 18.0 | 54 | 0.0891 |
+ | 0.1241 | 19.0 | 57 | 0.0876 |
+ | 0.1185 | 20.0 | 60 | 0.0868 |
+
 
  ### Framework versions
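A note on the README.md diff: the reported final loss (0.0868) is the epoch-20 value, which is not the lowest validation loss in the table. The sketch below finds the best epoch from the table values and estimates the training-set size from the step counts. The helper names `best_epoch` and `train_set_bounds` are hypothetical, and the size estimate assumes a single device, no gradient accumulation, and a partial last batch that is kept; none of these is confirmed by the commit.

```python
# Validation loss per epoch, copied from the "Training results" table
# in the README.md diff (epochs 1 through 20).
val_loss = [
    0.1081, 0.0983, 0.1285, 0.0961, 0.1040,
    0.1032, 0.0938, 0.0979, 0.0976, 0.0885,
    0.0907, 0.0863, 0.0843, 0.0833, 0.0840,
    0.0861, 0.0903, 0.0891, 0.0876, 0.0868,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def train_set_bounds(steps_per_epoch, batch_size):
    """Inclusive (low, high) bounds on the number of training examples.

    Assumption (not stated in the commit): one device, no gradient
    accumulation, and the partial final batch is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


epoch, loss = best_epoch(val_loss)
print(epoch, loss)               # 14 0.0833
print(train_set_bounds(3, 400))  # (801, 1200)
```

Under those assumptions, 3 steps per epoch at `train_batch_size: 400` implies between 801 and 1200 training examples. Whether the uploaded weights correspond to epoch 14 or epoch 20 cannot be determined from the diff alone.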
 
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "google/byt5-small",
+ "_name_or_path": "AlexWang99/byt5_1k",
  "architectures": [
  "T5ForConditionalGeneration"
  ],
logs/events.out.tfevents.1714285176.ip-10-25-205-144.266412.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b21af001fe5158c276dc24fdcff6d1e0b1dc82469343c4470940e188733512fb
+ size 25835
logs/events.out.tfevents.1714287609.ip-10-25-205-144.266412.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b94955ee84f5953be16af7990569559175738ccf0ab05b30e1681501329ca62
+ size 20569
logs/events.out.tfevents.1714289724.ip-10-25-205-144.266412.3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ef606384881afd7ee6220395ef6dd0dab0b554a58bbf2b2d68dcd37ab52ad1c
+ size 11148
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b1acc4b999f06bdef9036c06dcd7b3f67519c24515a6ec23dcfcfc5abe6feb43
+ oid sha256:b1d38312142676df106c129e100839b6de37257ace7652c5df8d40b1aa17cbdb
  size 1198571496
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:160d58158a6ec3f8a009c04569665c565cfda06315a5577cb2100ca9803c9bab
+ oid sha256:cab6b39dc088aa20d7417697114e95ffad3ae5054465ee82549b2fc0dc90a37d
  size 4792
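
The binary files in this commit (the event logs, model.safetensors, training_args.bin) appear in the diff only as Git LFS pointer files: three `key value` lines giving the spec version, the object hash, and the size in bytes. As a rough illustration, a pointer can be parsed as below. The function name `parse_lfs_pointer` and the returned dictionary layout are hypothetical choices for this sketch, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict.

    Each pointer line has the form "<key> <value>"; the oid value is
    "<hash algorithm>:<hex digest>".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for training_args.bin, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:cab6b39dc088aa20d7417697114e95ffad3ae5054465ee82549b2fc0dc90a37d
size 4792
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

For model.safetensors and training_args.bin the `oid` changes while `size` stays the same, which indicates that the contents were replaced by files of identical byte length.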