t5-small-finetuned-xsum

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics are typically computed follows the list):

  • Loss: 0.2889
  • Rouge1: 39.8328
  • Rouge2: 22.4239
  • RougeL: 39.9834
  • RougeLsum: 39.9724
  • Gen Len (average generated length in tokens): 16.0805
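
The ROUGE values above are reported on a 0–100 scale. The snippet below is a minimal sketch of how such scores are typically computed with the evaluate library; the prediction/reference texts are made-up placeholders, not items from the actual evaluation set.

```python
# Minimal sketch: computing ROUGE scores like the ones reported above.
# The prediction/reference pairs below are placeholders, not real evaluation data.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the cat sat on the mat"]        # model-generated summaries
references = ["a cat was sitting on the mat"]   # reference summaries

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# scores is a dict with keys rouge1, rouge2, rougeL, rougeLsum (values in 0..1);
# this card reports them multiplied by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```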

Model description

More information needed

Intended uses & limitations

More information needed
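
No usage details are provided by the author. As a minimal sketch (assuming the checkpoint hosted at guan06/t5-small-finetuned-xsum and the conventional T5 "summarize: " task prefix; the generation settings are illustrative), summarization inference could look like this:

```python
# Minimal sketch: running this checkpoint for summarization.
# Assumes the repo id guan06/t5-small-finetuned-xsum and the conventional
# "summarize: " prefix used by T5-style summarization fine-tunes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "guan06/t5-small-finetuned-xsum"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "Your input document goes here."  # placeholder text
inputs = tokenizer("summarize: " + document, return_tensors="pt",
                   truncation=True, max_length=512)

# Average generated length on the eval set was ~16 tokens, so a modest budget suffices.
summary_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```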

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
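
As a rough reconstruction (not the author's actual training script; dataset loading, tokenization, and the Trainer call are omitted, and the output directory is a placeholder), the hyperparameters above correspond to a Seq2SeqTrainingArguments configuration along these lines:

```python
# Rough sketch mapping the hyperparameters above onto Seq2SeqTrainingArguments.
# This is a reconstruction, not the original training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum",   # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",                    # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                              # "Native AMP" mixed precision
    eval_strategy="epoch",                  # matches the per-epoch rows in the table below
    predict_with_generate=True,             # required to compute ROUGE during evaluation
)
```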

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log | 1.0 | 70 | 0.2976 | 38.8006 | 21.9111 | 38.9213 | 38.9522 | 16.0847 |
| No log | 2.0 | 140 | 0.2965 | 38.5928 | 21.9111 | 38.7429 | 38.6634 | 16.1017 |
| No log | 3.0 | 210 | 0.2986 | 39.1757 | 22.0568 | 39.2841 | 39.3331 | 16.0847 |
| No log | 4.0 | 280 | 0.2962 | 39.4362 | 21.9778 | 39.4592 | 39.5481 | 16.0847 |
| No log | 5.0 | 350 | 0.2991 | 39.4154 | 22.3903 | 39.5864 | 39.574 | 16.0932 |
| No log | 6.0 | 420 | 0.2973 | 39.5908 | 22.5913 | 39.7026 | 39.7583 | 16.0932 |
| No log | 7.0 | 490 | 0.2973 | 39.6087 | 22.5315 | 39.7469 | 39.7952 | 16.0847 |
| 0.2763 | 8.0 | 560 | 0.2976 | 39.5607 | 22.5913 | 39.6733 | 39.7155 | 16.1102 |
| 0.2763 | 9.0 | 630 | 0.2986 | 38.7333 | 22.0903 | 38.8746 | 38.7966 | 16.0847 |
| 0.2763 | 10.0 | 700 | 0.2954 | 39.358 | 22.0903 | 39.4452 | 39.5052 | 16.1017 |
| 0.2763 | 11.0 | 770 | 0.2963 | 38.7295 | 21.9111 | 38.8486 | 38.7888 | 16.0847 |
| 0.2763 | 12.0 | 840 | 0.2950 | 38.6733 | 22.0903 | 38.868 | 38.8268 | 16.0847 |
| 0.2763 | 13.0 | 910 | 0.2969 | 39.2337 | 22.0903 | 39.3166 | 39.3686 | 16.0847 |
| 0.2763 | 14.0 | 980 | 0.2943 | 39.3254 | 22.2055 | 39.4837 | 39.5325 | 16.0678 |
| 0.2694 | 15.0 | 1050 | 0.2939 | 39.1597 | 21.9323 | 39.2799 | 39.2837 | 16.0847 |
| 0.2694 | 16.0 | 1120 | 0.2942 | 39.4126 | 22.0128 | 39.5265 | 39.5714 | 16.0847 |
| 0.2694 | 17.0 | 1190 | 0.2971 | 39.7798 | 22.728 | 39.9793 | 40.0021 | 16.0847 |
| 0.2694 | 18.0 | 1260 | 0.2956 | 39.701 | 22.5979 | 39.8913 | 39.929 | 16.0847 |
| 0.2694 | 19.0 | 1330 | 0.2945 | 39.6161 | 22.2776 | 39.7607 | 39.84 | 16.089 |
| 0.2694 | 20.0 | 1400 | 0.2947 | 39.4039 | 22.3206 | 39.5051 | 39.5487 | 16.0847 |
| 0.2694 | 21.0 | 1470 | 0.2945 | 39.7001 | 22.5777 | 39.8644 | 39.9219 | 16.0847 |
| 0.2632 | 22.0 | 1540 | 0.2943 | 39.4609 | 22.1144 | 39.5673 | 39.5833 | 16.089 |
| 0.2632 | 23.0 | 1610 | 0.2946 | 39.5428 | 22.3434 | 39.6705 | 39.7451 | 16.0847 |
| 0.2632 | 24.0 | 1680 | 0.2946 | 39.0183 | 22.0903 | 39.1424 | 39.1981 | 16.0847 |
| 0.2632 | 25.0 | 1750 | 0.2955 | 39.2686 | 22.4778 | 39.3774 | 39.3981 | 16.0847 |
| 0.2632 | 26.0 | 1820 | 0.2955 | 38.7063 | 21.936 | 38.8518 | 38.8452 | 16.072 |
| 0.2632 | 27.0 | 1890 | 0.2943 | 39.2686 | 22.5219 | 39.3774 | 39.3981 | 16.0678 |
| 0.2632 | 28.0 | 1960 | 0.2919 | 39.7635 | 22.5543 | 39.9236 | 40.0061 | 16.0932 |
| 0.258 | 29.0 | 2030 | 0.2911 | 39.7166 | 22.5549 | 39.9268 | 39.9832 | 16.0678 |
| 0.258 | 30.0 | 2100 | 0.2905 | 39.4208 | 22.6958 | 39.5539 | 39.5859 | 16.0763 |
| 0.258 | 31.0 | 2170 | 0.2917 | 39.4279 | 22.6251 | 39.575 | 39.5743 | 16.0763 |
| 0.258 | 32.0 | 2240 | 0.2904 | 39.6284 | 22.3963 | 39.8328 | 39.9046 | 16.0678 |
| 0.258 | 33.0 | 2310 | 0.2937 | 39.3461 | 22.5787 | 39.5004 | 39.4931 | 16.0678 |
| 0.258 | 34.0 | 2380 | 0.2897 | 39.3954 | 22.5787 | 39.5232 | 39.5061 | 16.0678 |
| 0.258 | 35.0 | 2450 | 0.2924 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0678 |
| 0.2523 | 36.0 | 2520 | 0.2927 | 39.5072 | 22.702 | 39.6958 | 39.6845 | 16.0678 |
| 0.2523 | 37.0 | 2590 | 0.2928 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0678 |
| 0.2523 | 38.0 | 2660 | 0.2899 | 39.4397 | 22.4174 | 39.5664 | 39.6141 | 16.072 |
| 0.2523 | 39.0 | 2730 | 0.2917 | 39.5985 | 22.5787 | 39.7005 | 39.7516 | 16.0932 |
| 0.2523 | 40.0 | 2800 | 0.2920 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0763 |
| 0.2523 | 41.0 | 2870 | 0.2898 | 39.3254 | 22.6509 | 39.4715 | 39.508 | 16.0678 |
| 0.2523 | 42.0 | 2940 | 0.2913 | 39.4994 | 22.8155 | 39.6122 | 39.5795 | 16.0847 |
| 0.2489 | 43.0 | 3010 | 0.2902 | 39.1453 | 22.5787 | 39.3306 | 39.2342 | 16.0847 |
| 0.2489 | 44.0 | 3080 | 0.2903 | 39.778 | 22.8155 | 39.856 | 39.891 | 16.0932 |
| 0.2489 | 45.0 | 3150 | 0.2896 | 39.3954 | 22.5787 | 39.5232 | 39.5061 | 16.0678 |
| 0.2489 | 46.0 | 3220 | 0.2899 | 39.659 | 22.8155 | 39.7524 | 39.7486 | 16.0847 |
| 0.2489 | 47.0 | 3290 | 0.2887 | 39.538 | 22.8155 | 39.6348 | 39.6385 | 16.0847 |
| 0.2489 | 48.0 | 3360 | 0.2905 | 39.538 | 22.8155 | 39.6348 | 39.6385 | 16.0678 |
| 0.2489 | 49.0 | 3430 | 0.2888 | 39.6043 | 22.5446 | 39.6811 | 39.6975 | 16.072 |
| 0.2442 | 50.0 | 3500 | 0.2905 | 39.6579 | 22.8155 | 39.7436 | 39.7544 | 16.0678 |
| 0.2442 | 51.0 | 3570 | 0.2917 | 39.6978 | 22.8155 | 39.7623 | 39.7429 | 16.0678 |
| 0.2442 | 52.0 | 3640 | 0.2886 | 39.5685 | 22.6587 | 39.6505 | 39.6604 | 16.0636 |
| 0.2442 | 53.0 | 3710 | 0.2893 | 39.6489 | 22.8155 | 39.7248 | 39.7425 | 16.0847 |
| 0.2442 | 54.0 | 3780 | 0.2910 | 39.6489 | 22.8155 | 39.7248 | 39.7425 | 16.0678 |
| 0.2442 | 55.0 | 3850 | 0.2900 | 39.7014 | 22.8155 | 39.7836 | 39.807 | 16.072 |
| 0.2442 | 56.0 | 3920 | 0.2893 | 39.7156 | 22.8155 | 39.8059 | 39.7891 | 16.072 |
| 0.2442 | 57.0 | 3990 | 0.2893 | 39.6579 | 22.8155 | 39.7436 | 39.7544 | 16.0847 |
| 0.2406 | 58.0 | 4060 | 0.2890 | 39.3975 | 22.1901 | 39.4816 | 39.5241 | 16.0763 |
| 0.2406 | 59.0 | 4130 | 0.2883 | 39.6046 | 22.2588 | 39.7636 | 39.7933 | 16.072 |
| 0.2406 | 60.0 | 4200 | 0.2895 | 39.8147 | 22.8155 | 39.8815 | 39.9257 | 16.0847 |
| 0.2406 | 61.0 | 4270 | 0.2900 | 39.6523 | 22.6587 | 39.7435 | 39.7004 | 16.072 |
| 0.2406 | 62.0 | 4340 | 0.2876 | 39.4672 | 22.6587 | 39.5554 | 39.533 | 16.072 |
| 0.2406 | 63.0 | 4410 | 0.2872 | 39.3354 | 22.4499 | 39.3997 | 39.4517 | 16.0636 |
| 0.2406 | 64.0 | 4480 | 0.2898 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2379 | 65.0 | 4550 | 0.2897 | 39.4043 | 22.4499 | 39.5828 | 39.5649 | 16.072 |
| 0.2379 | 66.0 | 4620 | 0.2897 | 39.7377 | 22.3954 | 39.8376 | 39.8243 | 16.0763 |
| 0.2379 | 67.0 | 4690 | 0.2898 | 39.5873 | 22.1218 | 39.7298 | 39.6907 | 16.0763 |
| 0.2379 | 68.0 | 4760 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2379 | 69.0 | 4830 | 0.2901 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2379 | 70.0 | 4900 | 0.2889 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0763 |
| 0.2379 | 71.0 | 4970 | 0.2888 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2354 | 72.0 | 5040 | 0.2891 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2354 | 73.0 | 5110 | 0.2893 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0847 |
| 0.2354 | 74.0 | 5180 | 0.2897 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2354 | 75.0 | 5250 | 0.2894 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
| 0.2354 | 76.0 | 5320 | 0.2892 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
| 0.2354 | 77.0 | 5390 | 0.2893 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0847 |
| 0.2354 | 78.0 | 5460 | 0.2885 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
| 0.2337 | 79.0 | 5530 | 0.2891 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0763 |
| 0.2337 | 80.0 | 5600 | 0.2888 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2337 | 81.0 | 5670 | 0.2885 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2337 | 82.0 | 5740 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2337 | 83.0 | 5810 | 0.2886 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2337 | 84.0 | 5880 | 0.2894 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2337 | 85.0 | 5950 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2318 | 86.0 | 6020 | 0.2885 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2318 | 87.0 | 6090 | 0.2887 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2318 | 88.0 | 6160 | 0.2883 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
| 0.2318 | 89.0 | 6230 | 0.2880 | 39.4264 | 22.1218 | 39.5743 | 39.5557 | 16.0763 |
| 0.2318 | 90.0 | 6300 | 0.2883 | 39.7689 | 22.4239 | 39.9782 | 39.9153 | 16.0805 |
| 0.2318 | 91.0 | 6370 | 0.2886 | 39.7689 | 22.4239 | 39.9782 | 39.9153 | 16.0805 |
| 0.2318 | 92.0 | 6440 | 0.2887 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2325 | 93.0 | 6510 | 0.2884 | 39.9004 | 22.4239 | 40.1203 | 40.0835 | 16.072 |
| 0.2325 | 94.0 | 6580 | 0.2886 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.072 |
| 0.2325 | 95.0 | 6650 | 0.2890 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2325 | 96.0 | 6720 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2325 | 97.0 | 6790 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2325 | 98.0 | 6860 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2325 | 99.0 | 6930 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
| 0.2303 | 100.0 | 7000 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu121
  • Tokenizers 0.20.3
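
To approximately reproduce this environment, the listed versions can be pinned. Below is a minimal sketch (the exact +cu121 PyTorch build may require installing from the matching PyTorch CUDA wheel index):

```python
# Minimal sketch: pinning packages to the versions listed above.
# Invoked via subprocess so the snippet stays in Python; the +cu121 PyTorch build
# may need the matching PyTorch CUDA wheel index rather than plain PyPI.
import subprocess

subprocess.run(
    ["pip", "install", "transformers==4.46.2", "torch==2.5.1", "tokenizers==0.20.3"],
    check=True,
)
```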

Model size

  • Safetensors, 60.5M params, F32 tensors

Model tree for guan06/t5-small-finetuned-xsum

  • Base model: google-t5/t5-small