mt5-small-finetuned-cnn-dailymail-en_nlp-course-chapter7-section4

This model is a fine-tuned version of google/mt5-small on the English CNN/DailyMail (cnn_dailymail) dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6856
  • ROUGE-1: 33.1062
  • ROUGE-2: 17.2193
  • ROUGE-L: 29.2904
  • ROUGE-Lsum: 31.0641

Model description

mT5-small is a multilingual encoder-decoder Transformer with roughly 300M parameters. This checkpoint fine-tunes it for abstractive summarization of English news articles, following chapter 7, section 4 of the Hugging Face NLP course (as the model name indicates).

Intended uses & limitations

The model is intended for summarizing English news articles similar to those in CNN/DailyMail. It has not been evaluated on other domains or languages, so quality there is unknown. A minimal usage sketch follows.
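
Since this is a standard Transformers checkpoint, it can be loaded with the summarization pipeline. A minimal sketch (the input text is a placeholder and the generation settings are illustrative, not the card author's):

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a summarization pipeline.
summarizer = pipeline(
    "summarization",
    model="BanUrsus/mt5-small-finetuned-cnn-dailymail-en_nlp-course-chapter7-section4",
)

# Placeholder article; any English news text works here.
article = "A fire broke out in a downtown warehouse early Monday. ..."

# max_length and truncation are illustrative generation settings.
summary = summarizer(article, max_length=64, truncation=True)
print(summary[0]["summary_text"])
```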

Training and evaluation data

The model was fine-tuned and evaluated on the English cnn_dailymail dataset (news articles paired with highlight summaries); the 35,890 steps per epoch at batch size 8 correspond to its ~287k training examples.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
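
For reference, this is roughly how the values above map onto Seq2SeqTrainingArguments. It is a hypothetical reconstruction (the output_dir and the evaluation/generation flags are assumptions), not the author's actual script; Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer, so it needs no explicit setting.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the configuration listed above.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-cnn-dailymail-en",  # placeholder name
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="epoch",   # assumed: the table reports per-epoch metrics
    predict_with_generate=True,    # assumed: required to compute ROUGE at eval time
)
```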

Training results

| Training Loss | Epoch | Step   | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|---------------|-------|--------|-----------------|---------|---------|---------|------------|
| 2.5643        | 1.0   | 35890  | 1.8832          | 32.0495 | 16.2168 | 28.2966 | 30.0297    |
| 2.2043        | 2.0   | 71780  | 1.7913          | 32.3942 | 16.6084 | 28.6015 | 30.3686    |
| 2.1059        | 3.0   | 107670 | 1.7545          | 32.5226 | 16.7984 | 28.8285 | 30.5446    |
| 2.0453        | 4.0   | 143560 | 1.7311          | 32.4746 | 16.7548 | 28.7544 | 30.4887    |
| 2.0033        | 5.0   | 179450 | 1.7099          | 33.1011 | 17.2836 | 29.3355 | 31.0773    |
| 1.9721        | 6.0   | 215340 | 1.6970          | 33.0955 | 17.1866 | 29.2639 | 31.0288    |
| 1.9501        | 7.0   | 251230 | 1.6915          | 33.2428 | 17.3318 | 29.4191 | 31.1813    |
| 1.9355        | 8.0   | 287120 | 1.6856          | 33.1062 | 17.2193 | 29.2904 | 31.0641    |
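
The ROUGE columns can be computed with the evaluate library; a minimal sketch on toy strings (in practice the predictions are generated summaries and the references are the dataset's highlight summaries):

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy strings stand in for generated summaries and reference highlights.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: 'rouge1', 'rouge2', 'rougeL', 'rougeLsum'
```

Note that evaluate returns F-measures in [0, 1]; the table above reports them scaled by 100.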

Framework versions

  • Transformers 4.35.2
  • PyTorch 1.11.0+cu102
  • Datasets 2.15.0
  • Tokenizers 0.15.0