XSum_t5-small_800_adafactor

This model is a fine-tuned version of /content/XSum_t5-small_800_adafactor/checkpoint-11000 (a t5-small checkpoint, as the model name indicates) on the XSum dataset. It achieves the following results on the evaluation set (a sketch for spot-checking these numbers follows the list):

  • Loss: 2.1714
  • Rouge1: 33.022
  • Rouge2: 11.9979
  • RougeL: 26.7476
  • RougeLsum: 26.7402
  • Gen Len: 18.7543
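The model ID and metric names below come from this card; everything else (the "summarize: " task prefix, the generation length, the `evaluate` library for ROUGE, and the tiny validation slice) is an assumption for illustration, not the author's actual evaluation script:

```python
# Hedged sketch: spot-check the reported ROUGE scores on a small slice of the
# XSum validation set. The "summarize: " prefix and generation settings are
# assumptions, not the author's documented evaluation configuration.
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import evaluate

model_id = "oMateos2020/XSum_t5-small_800_adafactor"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

ds = load_dataset("xsum", split="validation[:8]")  # tiny slice for a quick check
rouge = evaluate.load("rouge")

inputs = tokenizer(
    ["summarize: " + doc for doc in ds["document"]],  # assumed T5 task prefix
    truncation=True, padding=True, max_length=512, return_tensors="pt",
)
summary_ids = model.generate(**inputs, max_length=64)
predictions = tokenizer.batch_decode(summary_ids, skip_special_tokens=True)
print(rouge.compute(predictions=predictions, references=ds["summary"]))
```

Scores on such a small slice will differ from the full-set numbers above; run over the whole validation split to reproduce them properly.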

Model description

As the model name and base checkpoint path indicate, this is t5-small fine-tuned for abstractive summarization on XSum, apparently resumed from checkpoint-11000 of an earlier run of the same name. No further description has been provided by the author.

Intended uses & limitations

Given the training data, the model is intended for abstractive, single-sentence summarization of English news articles in the style of XSum; a usage sketch follows. Limitations have not been documented by the author.
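A minimal usage sketch; the article text is a placeholder and the generation parameters are illustrative:

```python
# Hedged sketch: summarize a news article with the fine-tuned model.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="oMateos2020/XSum_t5-small_800_adafactor",
)
article = "Replace this with the full text of a news article."  # placeholder input
print(summarizer(article, max_length=64, min_length=8, do_sample=False)[0]["summary_text"])
```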

Training and evaluation data

The model was fine-tuned and evaluated on the XSum dataset (BBC news articles paired with professionally written one-sentence summaries). The exact splits used and any subsampling are not documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch mapping them onto Seq2SeqTrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 25
  • eval_batch_size: 25
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
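A minimal sketch of how the values above map onto `Seq2SeqTrainingArguments`. The `output_dir`, evaluation cadence, and anything not in the list above are assumptions, not the author's actual training script; the listed Adam betas and epsilon are the `TrainingArguments` defaults, so they need no explicit arguments:

```python
# Hedged sketch: Seq2SeqTrainingArguments mirroring the hyperparameters above.
# adam_beta1/adam_beta2/adam_epsilon are left at their defaults
# (0.9, 0.999, 1e-8), which match the values reported on this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="XSum_t5-small_800_adafactor",  # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=25,
    per_device_eval_batch_size=25,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="steps",  # the results table logs eval every 100 steps
    eval_steps=100,
    logging_steps=100,
    predict_with_generate=True,   # assumed, so ROUGE/Gen Len can be computed
)
```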

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.3404        | 0.01  | 100  | 2.2058          | 32.4826 | 11.5807 | 26.2716 | 26.2611   | 18.7842 |
| 2.3194        | 0.02  | 200  | 2.2028          | 32.6393 | 11.661  | 26.372  | 26.3643   | 18.788  |
| 2.3247        | 0.04  | 300  | 2.1999          | 32.6792 | 11.6985 | 26.3876 | 26.3786   | 18.7354 |
| 2.3276        | 0.05  | 400  | 2.1979          | 32.6668 | 11.7272 | 26.3964 | 26.3907   | 18.7957 |
| 2.317         | 0.06  | 500  | 2.1957          | 32.8267 | 11.8165 | 26.5075 | 26.4997   | 18.7543 |
| 2.3214        | 0.07  | 600  | 2.1942          | 32.8319 | 11.8064 | 26.5428 | 26.5448   | 18.7693 |
| 2.3014        | 0.09  | 700  | 2.1931          | 32.7136 | 11.7334 | 26.4958 | 26.486    | 18.7759 |
| 2.3294        | 0.1   | 800  | 2.1902          | 32.6818 | 11.7684 | 26.4314 | 26.4242   | 18.785  |
| 2.299         | 0.11  | 900  | 2.1914          | 32.672  | 11.7606 | 26.4475 | 26.4367   | 18.7853 |
| 2.3009        | 0.12  | 1000 | 2.1900          | 32.7816 | 11.7958 | 26.5167 | 26.5099   | 18.7685 |
| 2.2913        | 0.13  | 1100 | 2.1885          | 32.6438 | 11.7398 | 26.4077 | 26.4051   | 18.7742 |
| 2.293         | 0.15  | 1200 | 2.1854          | 32.8228 | 11.841  | 26.548  | 26.5415   | 18.7899 |
| 2.2857        | 0.16  | 1300 | 2.1853          | 32.7118 | 11.7439 | 26.4989 | 26.4941   | 18.7998 |
| 2.2921        | 0.17  | 1400 | 2.1832          | 32.6705 | 11.7333 | 26.4076 | 26.4082   | 18.8017 |
| 2.3074        | 0.18  | 1500 | 2.1827          | 32.7543 | 11.7787 | 26.4904 | 26.4923   | 18.7827 |
| 2.3044        | 0.2   | 1600 | 2.1806          | 32.8573 | 11.8672 | 26.5655 | 26.5619   | 18.8097 |
| 2.2922        | 0.21  | 1700 | 2.1819          | 32.8394 | 11.8158 | 26.5523 | 26.5467   | 18.7891 |
| 2.2901        | 0.22  | 1800 | 2.1803          | 32.7219 | 11.7493 | 26.4644 | 26.4572   | 18.7882 |
| 2.286         | 0.23  | 1900 | 2.1790          | 32.7474 | 11.852  | 26.5078 | 26.5014   | 18.7699 |
| 2.298         | 0.25  | 2000 | 2.1781          | 32.8662 | 11.8878 | 26.618  | 26.6174   | 18.7979 |
| 2.2787        | 0.26  | 2100 | 2.1775          | 32.9621 | 11.9521 | 26.6955 | 26.6914   | 18.7934 |
| 2.2823        | 0.27  | 2200 | 2.1777          | 33.0633 | 12.0622 | 26.7715 | 26.7597   | 18.7954 |
| 2.2889        | 0.28  | 2300 | 2.1742          | 32.9637 | 12.0154 | 26.6771 | 26.6721   | 18.7844 |
| 2.2847        | 0.29  | 2400 | 2.1774          | 32.7435 | 11.8869 | 26.5334 | 26.5306   | 18.756  |
| 2.2923        | 0.31  | 2500 | 2.1754          | 32.8437 | 11.8977 | 26.59   | 26.587    | 18.7964 |
| 2.2877        | 0.32  | 2600 | 2.1740          | 32.9137 | 11.9267 | 26.618  | 26.6046   | 18.7678 |
| 2.2976        | 0.33  | 2700 | 2.1728          | 32.9372 | 11.9048 | 26.6412 | 26.6345   | 18.7838 |
| 2.2935        | 0.34  | 2800 | 2.1719          | 32.7338 | 11.7836 | 26.5667 | 26.5629   | 18.7659 |
| 2.2622        | 0.36  | 2900 | 2.1718          | 32.9847 | 11.978  | 26.7093 | 26.7008   | 18.7627 |
| 2.2749        | 0.37  | 3000 | 2.1710          | 32.9835 | 11.9809 | 26.7034 | 26.6946   | 18.8016 |
| 2.2615        | 0.38  | 3100 | 2.1721          | 32.9343 | 11.9317 | 26.6752 | 26.6695   | 18.7689 |
| 2.2825        | 0.39  | 3200 | 2.1714          | 33.022  | 11.9979 | 26.7476 | 26.7402   | 18.7543 |

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.12.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1