# t5-abs-2309-1054-lr-1e-05-bs-5-maxep-20
This model is a fine-tuned version of google-t5/t5-base on an unspecified dataset.
It achieves the following results on the evaluation set (a brief inference sketch follows the metrics):
- Loss: 4.0908
- Rouge/rouge1: 0.4752
- Rouge/rouge2: 0.2304
- Rouge/rougel: 0.4054
- Rouge/rougelsum: 0.4058
- Bertscore/bertscore-precision: 0.8974
- Bertscore/bertscore-recall: 0.8993
- Bertscore/bertscore-f1: 0.8982
- Meteor: 0.4445
- Gen Len: 41.7091
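
A minimal inference sketch, assuming the checkpoint is published as `roequitz/t5-abs-2309-1054-lr-1e-05-bs-5-maxep-20` on the Hugging Face Hub. The `summarize:` task prefix and the generation settings are assumptions following the usual T5 convention; the card does not document them.

```python
# Inference sketch; prefix and generation settings are assumptions,
# not documented in this model card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "roequitz/t5-abs-2309-1054-lr-1e-05-bs-5-maxep-20"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "summarize: " + "Your source document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# The evaluation above reports an average generated length of ~42 tokens,
# so a cap of 64 new tokens leaves some headroom.
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```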
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 5
- eval_batch_size: 5
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 10
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
- mixed_precision_training: Native AMP
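
A hedged reconstruction of this run with `Seq2SeqTrainingArguments`, mirroring the hyperparameters above. The dataset, column names, task prefix, and sequence lengths below are placeholders, since the training data is not documented in this card.

```python
# Sketch of the training setup implied by the hyperparameters above.
# The data and preprocessing are stand-ins: the card does not document them.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Toy stand-in for the undocumented training/evaluation data.
raw = Dataset.from_dict({
    "document": ["summarize: A long source article ..."] * 4,
    "summary": ["A short abstract."] * 4,
})

def preprocess(batch):
    model_inputs = tokenizer(batch["document"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-abs-2309-1054-lr-1e-05-bs-5-maxep-20",
    learning_rate=1e-5,
    per_device_train_batch_size=5,
    per_device_eval_batch_size=5,
    gradient_accumulation_steps=2,  # effective train batch size: 10
    num_train_epochs=20,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    fp16=True,  # "Native AMP" mixed precision; requires a CUDA device
    predict_with_generate=True,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer default.
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    eval_dataset=tokenized,  # stand-in; a real run uses a held-out split
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```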
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge/rouge1 | Rouge/rouge2 | Rouge/rougel | Rouge/rougelsum | Bertscore/bertscore-precision | Bertscore/bertscore-recall | Bertscore/bertscore-f1 | Meteor | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0043 | 1.0 | 87 | 3.9670 | 0.4794 | 0.2341 | 0.4098 | 0.4105 | 0.8988 | 0.9001 | 0.8993 | 0.4454 | 41.3091 |
| 0.0021 | 2.0 | 174 | 3.9846 | 0.482 | 0.2397 | 0.4136 | 0.4144 | 0.8988 | 0.8999 | 0.8993 | 0.4495 | 41.2182 |
| 0.0026 | 3.0 | 261 | 4.0097 | 0.4788 | 0.2365 | 0.4095 | 0.4104 | 0.8982 | 0.8995 | 0.8987 | 0.4461 | 41.3273 |
| 0.0028 | 4.0 | 348 | 4.0332 | 0.4773 | 0.2371 | 0.4078 | 0.4086 | 0.8974 | 0.8989 | 0.898 | 0.4476 | 41.6909 |
| 0.0027 | 5.0 | 435 | 4.0492 | 0.4799 | 0.2368 | 0.4087 | 0.4095 | 0.8981 | 0.8997 | 0.8988 | 0.4493 | 41.6818 |
| 0.0023 | 6.0 | 522 | 4.0660 | 0.4766 | 0.2319 | 0.405 | 0.4055 | 0.8971 | 0.899 | 0.8979 | 0.4466 | 41.8273 |
| 0.0023 | 7.0 | 609 | 4.0819 | 0.4777 | 0.2334 | 0.4066 | 0.407 | 0.8978 | 0.8988 | 0.8982 | 0.4457 | 41.5273 |
| 0.0023 | 8.0 | 696 | 4.0912 | 0.4799 | 0.2336 | 0.4085 | 0.4092 | 0.8979 | 0.8994 | 0.8985 | 0.4496 | 41.6364 |
| 0.0021 | 9.0 | 783 | 4.1035 | 0.4774 | 0.2328 | 0.4067 | 0.4075 | 0.8979 | 0.899 | 0.8983 | 0.4456 | 41.5909 |
| 0.0025 | 10.0 | 870 | 4.1177 | 0.4769 | 0.2321 | 0.4058 | 0.4064 | 0.898 | 0.8989 | 0.8983 | 0.4438 | 41.1727 |
| 0.0124 | 11.0 | 957 | 4.1056 | 0.4773 | 0.2327 | 0.4065 | 0.4069 | 0.8974 | 0.8992 | 0.8982 | 0.4466 | 41.7545 |
| 0.0119 | 12.0 | 1044 | 4.1007 | 0.4737 | 0.2291 | 0.4029 | 0.4036 | 0.8968 | 0.8992 | 0.8979 | 0.4442 | 41.9727 |
| 0.0119 | 13.0 | 1131 | 4.0992 | 0.4737 | 0.2303 | 0.4035 | 0.4037 | 0.8968 | 0.8987 | 0.8976 | 0.4416 | 41.6455 |
| 0.0117 | 14.0 | 1218 | 4.0943 | 0.4763 | 0.2302 | 0.4058 | 0.4058 | 0.8973 | 0.8989 | 0.898 | 0.4433 | 41.6273 |
| 0.0102 | 15.0 | 1305 | 4.0950 | 0.4744 | 0.2296 | 0.4041 | 0.4047 | 0.8971 | 0.899 | 0.8979 | 0.4434 | 41.7727 |
| 0.0105 | 16.0 | 1392 | 4.0931 | 0.474 | 0.2286 | 0.4033 | 0.4039 | 0.8972 | 0.8991 | 0.898 | 0.4431 | 41.7818 |
| 0.0096 | 17.0 | 1479 | 4.0920 | 0.4743 | 0.2298 | 0.4049 | 0.4052 | 0.8973 | 0.8992 | 0.8981 | 0.4431 | 41.6909 |
| 0.01 | 18.0 | 1566 | 4.0910 | 0.4756 | 0.23 | 0.4055 | 0.4055 | 0.8972 | 0.899 | 0.898 | 0.4439 | 41.6818 |
| 0.0105 | 19.0 | 1653 | 4.0911 | 0.4752 | 0.2306 | 0.4057 | 0.406 | 0.8974 | 0.8993 | 0.8982 | 0.4444 | 41.6727 |
| 0.0094 | 20.0 | 1740 | 4.0908 | 0.4752 | 0.2304 | 0.4054 | 0.4058 | 0.8974 | 0.8993 | 0.8982 | 0.4445 | 41.7091 |
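
The reported metrics match what the `evaluate` library produces. A sketch of how they can be recomputed follows; the prediction and reference texts are hypothetical placeholders, since the actual evaluation pipeline is not documented here.

```python
# Metric-computation sketch with the `evaluate` library; the texts below
# are hypothetical placeholders.
import evaluate

predictions = ["a model-generated summary"]
references = ["the reference summary for the same input"]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))
# {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}

bertscore = evaluate.load("bertscore")
bs = bertscore.compute(predictions=predictions, references=references, lang="en")
# BERTScore returns per-example lists; the card reports their averages.
print(sum(bs["f1"]) / len(bs["f1"]))

meteor = evaluate.load("meteor")
print(meteor.compute(predictions=predictions, references=references))
# {'meteor': ...}
```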
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0
- Datasets 2.21.0
- Tokenizers 0.19.1