mt5-small-finetuned-easy-read

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6421
  • Rouge1: 12.4785
  • Rouge2: 7.297
  • Rougel: 11.2459
  • Rougelsum: 11.8977
  • Gen Len: 20.0
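
As a minimal inference sketch (not an official snippet from this card): the code below loads the checkpoint from the Hub and generates an easy-read simplification. The Spanish example sentence, the beam setting, and max_length=20 (chosen to match the reported Gen Len) are assumptions.

```python
# Hedged sketch: load the fine-tuned checkpoint and generate a simplified
# version of an input sentence. The input text, num_beams, and max_length
# are illustrative assumptions, not settings documented in this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "joheras/mt5-small-finetuned-easy-read"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "La normativa vigente establece que los usuarios deberán abonar la tasa correspondiente."  # hypothetical input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_length=20, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```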

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
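
As a hedged sketch, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. The output directory, evaluation strategy, and `predict_with_generate` flag are assumptions (per-epoch evaluation is inferred from the results table below); dataset loading, preprocessing, and the `Seq2SeqTrainer` setup are omitted.

```python
# Hedged sketch of the training configuration implied by the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-easy-read",  # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",           # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                     # Native AMP mixed precision
    eval_strategy="epoch",         # assumption, inferred from per-epoch results
    predict_with_generate=True,    # assumption: needed for ROUGE / Gen Len
)
```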

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 450   | 2.1751          | 3.0651  | 1.4242 | 2.7239  | 2.9793    | 8.91    |
| 8.7064        | 2.0   | 900   | 2.0039          | 7.9637  | 3.5462 | 7.2607  | 7.7483    | 17.7633 |
| 3.0928        | 3.0   | 1350  | 1.9316          | 9.9888  | 4.4076 | 9.0744  | 9.736     | 19.6767 |
| 2.7145        | 4.0   | 1800  | 1.8842          | 10.663  | 4.7603 | 9.6266  | 10.3352   | 19.975  |
| 2.5488        | 5.0   | 2250  | 1.8269          | 12.2099 | 6.5741 | 10.9447 | 11.7951   | 19.9583 |
| 2.4292        | 6.0   | 2700  | 1.8062          | 12.3732 | 6.5612 | 11.0588 | 11.9345   | 19.9683 |
| 2.3469        | 7.0   | 3150  | 1.7903          | 12.393  | 6.8323 | 11.0593 | 11.918    | 19.97   |
| 2.2962        | 8.0   | 3600  | 1.7783          | 12.5416 | 7.129  | 11.2943 | 12.0636   | 19.9883 |
| 2.2392        | 9.0   | 4050  | 1.7542          | 12.569  | 7.3297 | 11.3892 | 12.1161   | 19.9833 |
| 2.1997        | 10.0  | 4500  | 1.7367          | 12.2996 | 7.0573 | 11.1477 | 11.7957   | 19.9817 |
| 2.1997        | 11.0  | 4950  | 1.7292          | 12.3945 | 7.0875 | 11.2196 | 11.8729   | 19.9983 |
| 2.167         | 12.0  | 5400  | 1.7141          | 12.5922 | 7.3044 | 11.3837 | 12.0437   | 20.0    |
| 2.1364        | 13.0  | 5850  | 1.7055          | 12.7888 | 7.4252 | 11.4793 | 12.2255   | 20.0    |
| 2.1001        | 14.0  | 6300  | 1.7050          | 12.8441 | 7.5659 | 11.4877 | 12.2913   | 20.0    |
| 2.0948        | 15.0  | 6750  | 1.6888          | 12.4564 | 7.2591 | 11.2335 | 11.9194   | 20.0    |
| 2.0641        | 16.0  | 7200  | 1.6814          | 12.6283 | 7.3345 | 11.3652 | 12.0453   | 20.0    |
| 2.0493        | 17.0  | 7650  | 1.6792          | 12.521  | 7.2919 | 11.2711 | 11.9666   | 20.0    |
| 2.0278        | 18.0  | 8100  | 1.6707          | 12.6528 | 7.4801 | 11.4206 | 12.1064   | 20.0    |
| 2.0158        | 19.0  | 8550  | 1.6567          | 12.3323 | 7.1938 | 11.1236 | 11.7917   | 20.0    |
| 2.0094        | 20.0  | 9000  | 1.6614          | 12.3774 | 7.2186 | 11.1618 | 11.8052   | 20.0    |
| 2.0094        | 21.0  | 9450  | 1.6608          | 12.4445 | 7.3107 | 11.1788 | 11.8749   | 20.0    |
| 1.9939        | 22.0  | 9900  | 1.6520          | 12.4746 | 7.2967 | 11.2498 | 11.9068   | 20.0    |
| 1.9834        | 23.0  | 10350 | 1.6478          | 12.4374 | 7.252  | 11.198  | 11.8311   | 20.0    |
| 1.976         | 24.0  | 10800 | 1.6477          | 12.3398 | 7.1823 | 11.1211 | 11.7798   | 20.0    |
| 1.9779        | 25.0  | 11250 | 1.6486          | 12.2964 | 7.1372 | 11.1039 | 11.7536   | 20.0    |
| 1.9566        | 26.0  | 11700 | 1.6430          | 12.4138 | 7.2163 | 11.2127 | 11.8479   | 20.0    |
| 1.9639        | 27.0  | 12150 | 1.6422          | 12.4914 | 7.2668 | 11.2499 | 11.922    | 20.0    |
| 1.9595        | 28.0  | 12600 | 1.6441          | 12.5039 | 7.3107 | 11.2503 | 11.9109   | 20.0    |
| 1.9539        | 29.0  | 13050 | 1.6422          | 12.4762 | 7.2738 | 11.2411 | 11.905    | 20.0    |
| 1.9546        | 30.0  | 13500 | 1.6421          | 12.4785 | 7.297  | 11.2459 | 11.8977   | 20.0    |
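
The ROUGE columns above are presumably computed with the `evaluate` library's `rouge` metric and reported scaled by 100, as is typical for these auto-generated cards; a minimal sketch of that computation follows (the prediction/reference strings are placeholders):

```python
# Hedged sketch of the ROUGE computation behind the table above.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["el texto simplificado generado"],  # placeholder
    references=["la simplificación de referencia"],  # placeholder
    use_stemmer=True,
)
# compute() returns fractions in [0, 1]; the table appears to report them x100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```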

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.0.0+cu117
  • Datasets 3.4.0
  • Tokenizers 0.21.0