metadata

license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
  - sacrebleu
model-index:
  - name: mT5-TextSimp-LT-BatchSize4-lr1e-4
    results: []

mT5-TextSimp-LT-BatchSize4-lr1e-4

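This model is a fine-tuned version of google/mt5-base. The training dataset was not recorded in this card (the trainer logged it as None). On the evaluation set it achieves the following results, which match the final logged evaluation at step 3200: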

Loss: 0.0737
Rouge1: 0.7174
Rouge2: 0.5553
Rougel: 0.7108
Sacrebleu: 43.3127
Gen Len: 38.0501
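
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub and loads through the standard seq2seq classes in Transformers. The repository id below uses a placeholder namespace and the `simplify` helper is hypothetical; neither comes from this card. The card also does not say whether a task prefix was used during fine-tuning, so none is added here, and `max_new_tokens=64` is an arbitrary choice.

```python
# Hypothetical repository id: the namespace is a placeholder, not stated in this card.
MODEL_ID = "your-namespace/mT5-TextSimp-LT-BatchSize4-lr1e-4"


def simplify(text, model_id=MODEL_ID, max_new_tokens=64):
    """Generate one output sequence for `text` with a fine-tuned seq2seq checkpoint."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize a single input string into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Default (greedy) decoding; the card does not document the generation
    # settings used for evaluation, so this is an assumption.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `simplify("...")` with a source sentence downloads the checkpoint on first use, so replace the placeholder id with the real repository path first.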

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 8
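
To make the scheduler settings above concrete, here is a small sketch of a linear schedule with warmup: the learning rate rises linearly from 0 to 0.0001 over the first 500 steps, then decays linearly to 0 at the end of training. The total of 3344 steps is an estimate, not a value from this card: the Epoch and Step columns in the training results are consistent with about 418 optimizer steps per epoch, and 8 x 418 = 3344. The function name `linear_warmup_lr` is illustrative.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=500, total_steps=3344):
    """Learning rate at `step` for linear warmup followed by linear decay.

    total_steps=3344 is an estimate (8 epochs x ~418 steps per epoch),
    inferred from the training log rather than stated in the card.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 200, 500, 1600, 3200, 3344):
        print(step, linear_warmup_lr(step))
```

Under these assumptions the first logged evaluation (step 200) falls inside the warmup phase, before the learning rate has reached its peak.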

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Sacrebleu	Gen Len
25.4634	0.48	200	18.5157	0.0061	0.0	0.0059	0.0008	512.0
1.1451	0.96	400	0.6596	0.0161	0.0003	0.0154	0.0215	39.0453
0.6441	1.44	600	0.4981	0.0272	0.0012	0.0259	0.0166	39.0453
0.247	1.91	800	0.1420	0.4769	0.2826	0.465	20.3212	38.0501
0.1549	2.39	1000	0.1032	0.6114	0.4299	0.5998	30.2603	38.0501
0.1482	2.87	1200	0.0934	0.6592	0.4815	0.6496	34.4213	38.0501
0.1163	3.35	1400	0.0867	0.6734	0.4968	0.6651	36.3741	38.0501
0.1042	3.83	1600	0.0816	0.6826	0.5127	0.6753	38.128	38.0501
0.1109	4.31	1800	0.0816	0.6893	0.5191	0.6818	39.3294	38.0501
0.1029	4.78	2000	0.0798	0.6968	0.5284	0.6901	40.5064	38.0501
0.0877	5.26	2200	0.0766	0.7006	0.5372	0.694	40.5295	38.0501
0.0748	5.74	2400	0.0759	0.7092	0.5403	0.7028	41.4424	38.0501
0.0941	6.22	2600	0.0754	0.7134	0.5471	0.7066	42.4212	38.0501
0.1095	6.7	2800	0.0737	0.7198	0.5547	0.7135	42.8225	38.0501
0.0749	7.18	3000	0.0735	0.7165	0.5536	0.7107	42.9748	38.0501
0.073	7.66	3200	0.0737	0.7174	0.5553	0.7108	43.3127	38.0501
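
The logged metrics do not all peak at the same step, which matters when choosing a checkpoint. The sketch below copies the Step, Validation Loss, Rouge1 and Sacrebleu columns from the table above and picks the best row for each criterion. The helper `best_row` and the tuple layout are illustrative, not part of the training code.

```python
# (step, validation_loss, rouge1, sacrebleu), copied from the training results table.
ROWS = [
    (200, 18.5157, 0.0061, 0.0008),
    (400, 0.6596, 0.0161, 0.0215),
    (600, 0.4981, 0.0272, 0.0166),
    (800, 0.1420, 0.4769, 20.3212),
    (1000, 0.1032, 0.6114, 30.2603),
    (1200, 0.0934, 0.6592, 34.4213),
    (1400, 0.0867, 0.6734, 36.3741),
    (1600, 0.0816, 0.6826, 38.128),
    (1800, 0.0816, 0.6893, 39.3294),
    (2000, 0.0798, 0.6968, 40.5064),
    (2200, 0.0766, 0.7006, 40.5295),
    (2400, 0.0759, 0.7092, 41.4424),
    (2600, 0.0754, 0.7134, 42.4212),
    (2800, 0.0737, 0.7198, 42.8225),
    (3000, 0.0735, 0.7165, 42.9748),
    (3200, 0.0737, 0.7174, 43.3127),
]


def best_row(rows, column, lower_is_better=False):
    """Return the row with the best value in `column` (first row wins ties)."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_row(ROWS, column=1, lower_is_better=True)
highest_rouge1 = best_row(ROWS, column=2)
highest_bleu = best_row(ROWS, column=3)

print("lowest validation loss at step", lowest_loss[0])
print("highest Rouge1 at step", highest_rouge1[0])
print("highest Sacrebleu at step", highest_bleu[0])
```

The differences among the last three evaluations are small, so the choice of criterion changes the selected step more than it changes the reported scores.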

Framework versions

Transformers 4.33.0
PyTorch 2.1.2+cu121
Datasets 2.14.4
Tokenizers 0.13.3