mt5-small-finetuned-easy-read

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6421
  • Rouge1: 12.4785
  • Rouge2: 7.297
  • Rougel: 11.2459
  • Rougelsum: 11.8977
  • Gen Len: 20.0
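
As a minimal inference sketch (not an official snippet from this card): the code below loads the checkpoint from the Hub and generates an easy-read simplification. The Spanish example sentence, the beam setting, and max_length=20 (chosen to match the reported Gen Len) are assumptions.

```python
# Hedged sketch: load the fine-tuned checkpoint and generate a simplified
# version of an input sentence. The input text, num_beams, and max_length
# are illustrative assumptions, not settings documented in this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "joheras/mt5-small-finetuned-easy-read"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "La normativa vigente establece que los usuarios deberán abonar la tasa correspondiente."  # hypothetical input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_length=20, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```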

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
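
As a hedged sketch, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. The output directory, evaluation strategy, and `predict_with_generate` flag are assumptions (per-epoch evaluation is inferred from the results table below); dataset loading, preprocessing, and the `Seq2SeqTrainer` setup are omitted.

```python
# Hedged sketch of the training configuration implied by the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-easy-read",  # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",           # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                     # Native AMP mixed precision
    eval_strategy="epoch",         # assumption, inferred from per-epoch results
    predict_with_generate=True,    # assumption: needed for ROUGE / Gen Len
)
```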

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 450   | 2.1751          | 3.0651  | 1.4242 | 2.7239  | 2.9793    | 8.91    |
| 8.7064        | 2.0   | 900   | 2.0039          | 7.9637  | 3.5462 | 7.2607  | 7.7483    | 17.7633 |
| 3.0928        | 3.0   | 1350  | 1.9316          | 9.9888  | 4.4076 | 9.0744  | 9.736     | 19.6767 |
| 2.7145        | 4.0   | 1800  | 1.8842          | 10.663  | 4.7603 | 9.6266  | 10.3352   | 19.975  |
| 2.5488        | 5.0   | 2250  | 1.8269          | 12.2099 | 6.5741 | 10.9447 | 11.7951   | 19.9583 |
| 2.4292        | 6.0   | 2700  | 1.8062          | 12.3732 | 6.5612 | 11.0588 | 11.9345   | 19.9683 |
| 2.3469        | 7.0   | 3150  | 1.7903          | 12.393  | 6.8323 | 11.0593 | 11.918    | 19.97   |
| 2.2962        | 8.0   | 3600  | 1.7783          | 12.5416 | 7.129  | 11.2943 | 12.0636   | 19.9883 |
| 2.2392        | 9.0   | 4050  | 1.7542          | 12.569  | 7.3297 | 11.3892 | 12.1161   | 19.9833 |
| 2.1997        | 10.0  | 4500  | 1.7367          | 12.2996 | 7.0573 | 11.1477 | 11.7957   | 19.9817 |
| 2.1997        | 11.0  | 4950  | 1.7292          | 12.3945 | 7.0875 | 11.2196 | 11.8729   | 19.9983 |
| 2.167         | 12.0  | 5400  | 1.7141          | 12.5922 | 7.3044 | 11.3837 | 12.0437   | 20.0    |
| 2.1364        | 13.0  | 5850  | 1.7055          | 12.7888 | 7.4252 | 11.4793 | 12.2255   | 20.0    |
| 2.1001        | 14.0  | 6300  | 1.7050          | 12.8441 | 7.5659 | 11.4877 | 12.2913   | 20.0    |
| 2.0948        | 15.0  | 6750  | 1.6888          | 12.4564 | 7.2591 | 11.2335 | 11.9194   | 20.0    |
| 2.0641        | 16.0  | 7200  | 1.6814          | 12.6283 | 7.3345 | 11.3652 | 12.0453   | 20.0    |
| 2.0493        | 17.0  | 7650  | 1.6792          | 12.521  | 7.2919 | 11.2711 | 11.9666   | 20.0    |
| 2.0278        | 18.0  | 8100  | 1.6707          | 12.6528 | 7.4801 | 11.4206 | 12.1064   | 20.0    |
| 2.0158        | 19.0  | 8550  | 1.6567          | 12.3323 | 7.1938 | 11.1236 | 11.7917   | 20.0    |
| 2.0094        | 20.0  | 9000  | 1.6614          | 12.3774 | 7.2186 | 11.1618 | 11.8052   | 20.0    |
| 2.0094        | 21.0  | 9450  | 1.6608          | 12.4445 | 7.3107 | 11.1788 | 11.8749   | 20.0    |
| 1.9939        | 22.0  | 9900  | 1.6520          | 12.4746 | 7.2967 | 11.2498 | 11.9068   | 20.0    |
| 1.9834        | 23.0  | 10350 | 1.6478          | 12.4374 | 7.252  | 11.198  | 11.8311   | 20.0    |
| 1.976         | 24.0  | 10800 | 1.6477          | 12.3398 | 7.1823 | 11.1211 | 11.7798   | 20.0    |
| 1.9779        | 25.0  | 11250 | 1.6486          | 12.2964 | 7.1372 | 11.1039 | 11.7536   | 20.0    |
| 1.9566        | 26.0  | 11700 | 1.6430          | 12.4138 | 7.2163 | 11.2127 | 11.8479   | 20.0    |
| 1.9639        | 27.0  | 12150 | 1.6422          | 12.4914 | 7.2668 | 11.2499 | 11.922    | 20.0    |
| 1.9595        | 28.0  | 12600 | 1.6441          | 12.5039 | 7.3107 | 11.2503 | 11.9109   | 20.0    |
| 1.9539        | 29.0  | 13050 | 1.6422          | 12.4762 | 7.2738 | 11.2411 | 11.905    | 20.0    |
| 1.9546        | 30.0  | 13500 | 1.6421          | 12.4785 | 7.297  | 11.2459 | 11.8977   | 20.0    |
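
The ROUGE columns above are presumably computed with the `evaluate` library's `rouge` metric and reported scaled by 100, as is typical for these auto-generated cards; a minimal sketch of that computation follows (the prediction/reference strings are placeholders):

```python
# Hedged sketch of the ROUGE computation behind the table above.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["el texto simplificado generado"],  # placeholder
    references=["la simplificación de referencia"],  # placeholder
    use_stemmer=True,
)
# compute() returns fractions in [0, 1]; the table appears to report them x100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```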

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.0.0+cu117
  • Datasets 3.4.0
  • Tokenizers 0.21.0