# t5-small-finetuned-xsum
This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2889
- Rouge1: 39.8328
- Rouge2: 22.4239
- RougeL: 39.9834
- RougeLsum: 39.9724
- Gen Len: 16.0805
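The checkpoint is published on the Hub as `guan06/t5-small-finetuned-xsum` and loads with the standard `transformers` seq2seq classes. Below is a minimal inference sketch: the `summarize: ` task prefix is inherited from the t5-small base model and may or may not match how this checkpoint was fine-tuned, and the generation settings are illustrative rather than taken from this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "guan06/t5-small-finetuned-xsum"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 models are usually prompted with a task prefix; "summarize: " is the
# prefix the t5-small base model uses, assumed to apply here as well.
article = "summarize: " + "Your long article text goes here ..."
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Average generated length on the eval set was ~16 tokens, so a small
# max_new_tokens budget is reasonable; beam search is an illustrative choice.
summary_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```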
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
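Expressed as `Seq2SeqTrainingArguments`, the list above corresponds roughly to the sketch below; `output_dir`, the evaluation strategy, and `predict_with_generate` are assumptions (the per-epoch validation rows in the results table suggest epoch-level evaluation), not values stated on the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum",  # assumed, not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                    # "Native AMP" mixed-precision training
    eval_strategy="epoch",        # assumed from the per-epoch validation rows
    predict_with_generate=True,   # assumed; needed to compute ROUGE/Gen Len
)
```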
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 70 | 0.2976 | 38.8006 | 21.9111 | 38.9213 | 38.9522 | 16.0847 |
No log | 2.0 | 140 | 0.2965 | 38.5928 | 21.9111 | 38.7429 | 38.6634 | 16.1017 |
No log | 3.0 | 210 | 0.2986 | 39.1757 | 22.0568 | 39.2841 | 39.3331 | 16.0847 |
No log | 4.0 | 280 | 0.2962 | 39.4362 | 21.9778 | 39.4592 | 39.5481 | 16.0847 |
No log | 5.0 | 350 | 0.2991 | 39.4154 | 22.3903 | 39.5864 | 39.574 | 16.0932 |
No log | 6.0 | 420 | 0.2973 | 39.5908 | 22.5913 | 39.7026 | 39.7583 | 16.0932 |
No log | 7.0 | 490 | 0.2973 | 39.6087 | 22.5315 | 39.7469 | 39.7952 | 16.0847 |
0.2763 | 8.0 | 560 | 0.2976 | 39.5607 | 22.5913 | 39.6733 | 39.7155 | 16.1102 |
0.2763 | 9.0 | 630 | 0.2986 | 38.7333 | 22.0903 | 38.8746 | 38.7966 | 16.0847 |
0.2763 | 10.0 | 700 | 0.2954 | 39.358 | 22.0903 | 39.4452 | 39.5052 | 16.1017 |
0.2763 | 11.0 | 770 | 0.2963 | 38.7295 | 21.9111 | 38.8486 | 38.7888 | 16.0847 |
0.2763 | 12.0 | 840 | 0.2950 | 38.6733 | 22.0903 | 38.868 | 38.8268 | 16.0847 |
0.2763 | 13.0 | 910 | 0.2969 | 39.2337 | 22.0903 | 39.3166 | 39.3686 | 16.0847 |
0.2763 | 14.0 | 980 | 0.2943 | 39.3254 | 22.2055 | 39.4837 | 39.5325 | 16.0678 |
0.2694 | 15.0 | 1050 | 0.2939 | 39.1597 | 21.9323 | 39.2799 | 39.2837 | 16.0847 |
0.2694 | 16.0 | 1120 | 0.2942 | 39.4126 | 22.0128 | 39.5265 | 39.5714 | 16.0847 |
0.2694 | 17.0 | 1190 | 0.2971 | 39.7798 | 22.728 | 39.9793 | 40.0021 | 16.0847 |
0.2694 | 18.0 | 1260 | 0.2956 | 39.701 | 22.5979 | 39.8913 | 39.929 | 16.0847 |
0.2694 | 19.0 | 1330 | 0.2945 | 39.6161 | 22.2776 | 39.7607 | 39.84 | 16.089 |
0.2694 | 20.0 | 1400 | 0.2947 | 39.4039 | 22.3206 | 39.5051 | 39.5487 | 16.0847 |
0.2694 | 21.0 | 1470 | 0.2945 | 39.7001 | 22.5777 | 39.8644 | 39.9219 | 16.0847 |
0.2632 | 22.0 | 1540 | 0.2943 | 39.4609 | 22.1144 | 39.5673 | 39.5833 | 16.089 |
0.2632 | 23.0 | 1610 | 0.2946 | 39.5428 | 22.3434 | 39.6705 | 39.7451 | 16.0847 |
0.2632 | 24.0 | 1680 | 0.2946 | 39.0183 | 22.0903 | 39.1424 | 39.1981 | 16.0847 |
0.2632 | 25.0 | 1750 | 0.2955 | 39.2686 | 22.4778 | 39.3774 | 39.3981 | 16.0847 |
0.2632 | 26.0 | 1820 | 0.2955 | 38.7063 | 21.936 | 38.8518 | 38.8452 | 16.072 |
0.2632 | 27.0 | 1890 | 0.2943 | 39.2686 | 22.5219 | 39.3774 | 39.3981 | 16.0678 |
0.2632 | 28.0 | 1960 | 0.2919 | 39.7635 | 22.5543 | 39.9236 | 40.0061 | 16.0932 |
0.258 | 29.0 | 2030 | 0.2911 | 39.7166 | 22.5549 | 39.9268 | 39.9832 | 16.0678 |
0.258 | 30.0 | 2100 | 0.2905 | 39.4208 | 22.6958 | 39.5539 | 39.5859 | 16.0763 |
0.258 | 31.0 | 2170 | 0.2917 | 39.4279 | 22.6251 | 39.575 | 39.5743 | 16.0763 |
0.258 | 32.0 | 2240 | 0.2904 | 39.6284 | 22.3963 | 39.8328 | 39.9046 | 16.0678 |
0.258 | 33.0 | 2310 | 0.2937 | 39.3461 | 22.5787 | 39.5004 | 39.4931 | 16.0678 |
0.258 | 34.0 | 2380 | 0.2897 | 39.3954 | 22.5787 | 39.5232 | 39.5061 | 16.0678 |
0.258 | 35.0 | 2450 | 0.2924 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0678 |
0.2523 | 36.0 | 2520 | 0.2927 | 39.5072 | 22.702 | 39.6958 | 39.6845 | 16.0678 |
0.2523 | 37.0 | 2590 | 0.2928 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0678 |
0.2523 | 38.0 | 2660 | 0.2899 | 39.4397 | 22.4174 | 39.5664 | 39.6141 | 16.072 |
0.2523 | 39.0 | 2730 | 0.2917 | 39.5985 | 22.5787 | 39.7005 | 39.7516 | 16.0932 |
0.2523 | 40.0 | 2800 | 0.2920 | 39.4158 | 22.5787 | 39.6149 | 39.5978 | 16.0763 |
0.2523 | 41.0 | 2870 | 0.2898 | 39.3254 | 22.6509 | 39.4715 | 39.508 | 16.0678 |
0.2523 | 42.0 | 2940 | 0.2913 | 39.4994 | 22.8155 | 39.6122 | 39.5795 | 16.0847 |
0.2489 | 43.0 | 3010 | 0.2902 | 39.1453 | 22.5787 | 39.3306 | 39.2342 | 16.0847 |
0.2489 | 44.0 | 3080 | 0.2903 | 39.778 | 22.8155 | 39.856 | 39.891 | 16.0932 |
0.2489 | 45.0 | 3150 | 0.2896 | 39.3954 | 22.5787 | 39.5232 | 39.5061 | 16.0678 |
0.2489 | 46.0 | 3220 | 0.2899 | 39.659 | 22.8155 | 39.7524 | 39.7486 | 16.0847 |
0.2489 | 47.0 | 3290 | 0.2887 | 39.538 | 22.8155 | 39.6348 | 39.6385 | 16.0847 |
0.2489 | 48.0 | 3360 | 0.2905 | 39.538 | 22.8155 | 39.6348 | 39.6385 | 16.0678 |
0.2489 | 49.0 | 3430 | 0.2888 | 39.6043 | 22.5446 | 39.6811 | 39.6975 | 16.072 |
0.2442 | 50.0 | 3500 | 0.2905 | 39.6579 | 22.8155 | 39.7436 | 39.7544 | 16.0678 |
0.2442 | 51.0 | 3570 | 0.2917 | 39.6978 | 22.8155 | 39.7623 | 39.7429 | 16.0678 |
0.2442 | 52.0 | 3640 | 0.2886 | 39.5685 | 22.6587 | 39.6505 | 39.6604 | 16.0636 |
0.2442 | 53.0 | 3710 | 0.2893 | 39.6489 | 22.8155 | 39.7248 | 39.7425 | 16.0847 |
0.2442 | 54.0 | 3780 | 0.2910 | 39.6489 | 22.8155 | 39.7248 | 39.7425 | 16.0678 |
0.2442 | 55.0 | 3850 | 0.2900 | 39.7014 | 22.8155 | 39.7836 | 39.807 | 16.072 |
0.2442 | 56.0 | 3920 | 0.2893 | 39.7156 | 22.8155 | 39.8059 | 39.7891 | 16.072 |
0.2442 | 57.0 | 3990 | 0.2893 | 39.6579 | 22.8155 | 39.7436 | 39.7544 | 16.0847 |
0.2406 | 58.0 | 4060 | 0.2890 | 39.3975 | 22.1901 | 39.4816 | 39.5241 | 16.0763 |
0.2406 | 59.0 | 4130 | 0.2883 | 39.6046 | 22.2588 | 39.7636 | 39.7933 | 16.072 |
0.2406 | 60.0 | 4200 | 0.2895 | 39.8147 | 22.8155 | 39.8815 | 39.9257 | 16.0847 |
0.2406 | 61.0 | 4270 | 0.2900 | 39.6523 | 22.6587 | 39.7435 | 39.7004 | 16.072 |
0.2406 | 62.0 | 4340 | 0.2876 | 39.4672 | 22.6587 | 39.5554 | 39.533 | 16.072 |
0.2406 | 63.0 | 4410 | 0.2872 | 39.3354 | 22.4499 | 39.3997 | 39.4517 | 16.0636 |
0.2406 | 64.0 | 4480 | 0.2898 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2379 | 65.0 | 4550 | 0.2897 | 39.4043 | 22.4499 | 39.5828 | 39.5649 | 16.072 |
0.2379 | 66.0 | 4620 | 0.2897 | 39.7377 | 22.3954 | 39.8376 | 39.8243 | 16.0763 |
0.2379 | 67.0 | 4690 | 0.2898 | 39.5873 | 22.1218 | 39.7298 | 39.6907 | 16.0763 |
0.2379 | 68.0 | 4760 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2379 | 69.0 | 4830 | 0.2901 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2379 | 70.0 | 4900 | 0.2889 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0763 |
0.2379 | 71.0 | 4970 | 0.2888 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2354 | 72.0 | 5040 | 0.2891 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2354 | 73.0 | 5110 | 0.2893 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0847 |
0.2354 | 74.0 | 5180 | 0.2897 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2354 | 75.0 | 5250 | 0.2894 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
0.2354 | 76.0 | 5320 | 0.2892 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
0.2354 | 77.0 | 5390 | 0.2893 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0847 |
0.2354 | 78.0 | 5460 | 0.2885 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0847 |
0.2337 | 79.0 | 5530 | 0.2891 | 39.6293 | 22.1218 | 39.7629 | 39.7467 | 16.0763 |
0.2337 | 80.0 | 5600 | 0.2888 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2337 | 81.0 | 5670 | 0.2885 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2337 | 82.0 | 5740 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2337 | 83.0 | 5810 | 0.2886 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2337 | 84.0 | 5880 | 0.2894 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2337 | 85.0 | 5950 | 0.2889 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2318 | 86.0 | 6020 | 0.2885 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2318 | 87.0 | 6090 | 0.2887 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2318 | 88.0 | 6160 | 0.2883 | 39.5053 | 22.1218 | 39.6356 | 39.5945 | 16.0763 |
0.2318 | 89.0 | 6230 | 0.2880 | 39.4264 | 22.1218 | 39.5743 | 39.5557 | 16.0763 |
0.2318 | 90.0 | 6300 | 0.2883 | 39.7689 | 22.4239 | 39.9782 | 39.9153 | 16.0805 |
0.2318 | 91.0 | 6370 | 0.2886 | 39.7689 | 22.4239 | 39.9782 | 39.9153 | 16.0805 |
0.2318 | 92.0 | 6440 | 0.2887 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2325 | 93.0 | 6510 | 0.2884 | 39.9004 | 22.4239 | 40.1203 | 40.0835 | 16.072 |
0.2325 | 94.0 | 6580 | 0.2886 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.072 |
0.2325 | 95.0 | 6650 | 0.2890 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2325 | 96.0 | 6720 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2325 | 97.0 | 6790 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2325 | 98.0 | 6860 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2325 | 99.0 | 6930 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
0.2303 | 100.0 | 7000 | 0.2889 | 39.8328 | 22.4239 | 39.9834 | 39.9724 | 16.0805 |
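Scores in this range (0 to 100, four decimals) match what the `evaluate` library's `rouge` metric produces after scaling, which is how the stock `run_summarization.py` script reports ROUGE; assuming that setup, they can be reproduced along these lines:

```python
import evaluate

rouge = evaluate.load("rouge")

# Illustrative inputs: decoded model summaries vs. reference summaries.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
# compute() returns rouge1/rouge2/rougeL/rougeLsum as fractions in [0, 1];
# the table above reports them scaled to 0-100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```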
### Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Tokenizers 0.20.3