# t5-small-thaisum
This model is a fine-tuned version of t5-small. The fine-tuning dataset is not specified in this card (the model name suggests a Thai summarization corpus such as ThaiSum). It achieves the following results on the evaluation set:
- Loss: 0.9925
- Rouge1: 0.1865
- Rouge2: 0.0712
- Rougel: 0.184
- Rougelsum: 0.1834
- Gen Len: 17.465
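The ROUGE scores above measure n-gram overlap between generated and reference summaries. As a rough illustration (not the exact scorer used for this card), ROUGE-1 F1 can be sketched as below; note this assumes whitespace tokenization, which is inadequate for Thai text in practice, where a proper word segmenter would be needed:

```python
# Minimal sketch of ROUGE-1 F1 (unigram overlap F-measure).
# Assumes whitespace tokenization; real Thai evaluation needs word segmentation.
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: each reference token counts at most once per occurrence.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```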
## Model description
More information needed
## Intended uses & limitations
More information needed
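Although the card gives no usage details, a checkpoint like this can typically be loaded through the `transformers` summarization pipeline. The sketch below is an assumption, not documented usage: the repo id `"t5-small-thaisum"` is a placeholder for the actual model path, and `max_length=20` is only a guess informed by the ~17.5 average generation length reported above.

```python
# Hedged usage sketch: loading the checkpoint for summarization.
# "t5-small-thaisum" is a placeholder repo id; substitute the real model path.
from transformers import pipeline

def summarize(text: str, model_id: str = "t5-small-thaisum") -> str:
    # max_length=20 is an assumption based on the reported Gen Len (~17.5 tokens).
    summarizer = pipeline("summarization", model=model_id)
    return summarizer(text, max_length=20)[0]["summary_text"]
```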
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
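The hyperparameters above map onto the standard `transformers` training arguments roughly as follows. This is a sketch, not the author's actual script: `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions (the last two inferred from the per-epoch ROUGE results below); the Adam betas and epsilon listed above are the optimizer defaults.

```python
# Sketch of training arguments matching the listed hyperparameters.
# output_dir, evaluation_strategy, and predict_with_generate are assumptions.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-thaisum",   # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
    predict_with_generate=True,      # assumed; needed to compute ROUGE during eval
)
```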
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|
No log | 1.0 | 200 | 0.5551 | 0.0154 | 0.0112 | 0.0146 | 0.0148 | 18.8775 |
No log | 2.0 | 400 | 0.5326 | 0.0461 | 0.0322 | 0.0453 | 0.0448 | 18.975 |
0.5806 | 3.0 | 600 | 0.5353 | 0.0297 | 0.0104 | 0.0285 | 0.0283 | 18.8225 |
0.5806 | 4.0 | 800 | 0.5345 | 0.0596 | 0.0396 | 0.0585 | 0.0589 | 18.5025 |
0.5104 | 5.0 | 1000 | 0.5230 | 0.0464 | 0.031 | 0.0458 | 0.0454 | 18.9075 |
0.5104 | 6.0 | 1200 | 0.5138 | 0.0269 | 0.0128 | 0.0255 | 0.0258 | 18.9075 |
0.5104 | 7.0 | 1400 | 0.5186 | 0.0389 | 0.0215 | 0.0379 | 0.0382 | 18.5075 |
0.472 | 8.0 | 1600 | 0.5213 | 0.0478 | 0.0266 | 0.0451 | 0.0454 | 18.6375 |
0.472 | 9.0 | 1800 | 0.5284 | 0.0608 | 0.0286 | 0.0586 | 0.0584 | 18.8225 |
0.4389 | 10.0 | 2000 | 0.5352 | 0.0494 | 0.0272 | 0.0477 | 0.0476 | 18.8925 |
0.4389 | 11.0 | 2200 | 0.5372 | 0.0957 | 0.0477 | 0.0947 | 0.0939 | 18.59 |
0.4389 | 12.0 | 2400 | 0.5380 | 0.0587 | 0.0303 | 0.0582 | 0.0582 | 18.3225 |
0.413 | 13.0 | 2600 | 0.5547 | 0.1026 | 0.0694 | 0.1018 | 0.1017 | 18.2725 |
0.413 | 14.0 | 2800 | 0.5374 | 0.0749 | 0.0368 | 0.0737 | 0.0736 | 18.555 |
0.3963 | 15.0 | 3000 | 0.5469 | 0.0736 | 0.0311 | 0.0729 | 0.073 | 18.5675 |
0.3963 | 16.0 | 3200 | 0.5541 | 0.0685 | 0.0351 | 0.0689 | 0.0687 | 18.6925 |
0.3963 | 17.0 | 3400 | 0.5654 | 0.116 | 0.0654 | 0.1148 | 0.1137 | 18.105 |
0.3711 | 18.0 | 3600 | 0.5666 | 0.0849 | 0.0389 | 0.0852 | 0.0847 | 18.6025 |
0.3711 | 19.0 | 3800 | 0.5739 | 0.1234 | 0.0618 | 0.1205 | 0.1195 | 18.5825 |
0.3572 | 20.0 | 4000 | 0.5605 | 0.1121 | 0.0522 | 0.1124 | 0.1112 | 18.405 |
0.3572 | 21.0 | 4200 | 0.5952 | 0.117 | 0.0508 | 0.1152 | 0.1146 | 18.4525 |
0.3572 | 22.0 | 4400 | 0.5673 | 0.0876 | 0.0385 | 0.0867 | 0.0866 | 18.6025 |
0.3367 | 23.0 | 4600 | 0.5772 | 0.0922 | 0.0319 | 0.093 | 0.0928 | 18.57 |
0.3367 | 24.0 | 4800 | 0.5908 | 0.1137 | 0.0495 | 0.1134 | 0.1136 | 18.0325 |
0.3238 | 25.0 | 5000 | 0.5958 | 0.1197 | 0.051 | 0.118 | 0.1176 | 18.285 |
0.3238 | 26.0 | 5200 | 0.5943 | 0.109 | 0.046 | 0.1064 | 0.1058 | 18.0575 |
0.3238 | 27.0 | 5400 | 0.6297 | 0.154 | 0.0621 | 0.1503 | 0.1489 | 18.06 |
0.3089 | 28.0 | 5600 | 0.6307 | 0.1306 | 0.0494 | 0.1302 | 0.13 | 18.2325 |
0.3089 | 29.0 | 5800 | 0.6342 | 0.1441 | 0.0606 | 0.1413 | 0.1415 | 17.5975 |
0.2965 | 30.0 | 6000 | 0.6426 | 0.156 | 0.0648 | 0.1492 | 0.1488 | 18.2425 |
0.2965 | 31.0 | 6200 | 0.6459 | 0.1533 | 0.0618 | 0.15 | 0.1501 | 17.9225 |
0.2965 | 32.0 | 6400 | 0.6498 | 0.1506 | 0.0582 | 0.1482 | 0.1476 | 18.2 |
0.283 | 33.0 | 6600 | 0.6566 | 0.1609 | 0.0659 | 0.1561 | 0.1555 | 18.0 |
0.283 | 34.0 | 6800 | 0.6616 | 0.1594 | 0.0611 | 0.156 | 0.1559 | 18.08 |
0.2757 | 35.0 | 7000 | 0.6814 | 0.1657 | 0.0712 | 0.1613 | 0.1615 | 18.01 |
0.2757 | 36.0 | 7200 | 0.6919 | 0.1415 | 0.06 | 0.1372 | 0.1374 | 18.285 |
0.2757 | 37.0 | 7400 | 0.6819 | 0.1262 | 0.0501 | 0.126 | 0.1265 | 18.2125 |
0.2598 | 38.0 | 7600 | 0.6896 | 0.1354 | 0.0616 | 0.1336 | 0.1341 | 17.8525 |
0.2598 | 39.0 | 7800 | 0.7094 | 0.1497 | 0.0706 | 0.1478 | 0.1483 | 18.0175 |
0.2508 | 40.0 | 8000 | 0.6995 | 0.1563 | 0.0668 | 0.1547 | 0.155 | 17.9175 |
0.2508 | 41.0 | 8200 | 0.7236 | 0.1544 | 0.065 | 0.1536 | 0.154 | 17.37 |
0.2508 | 42.0 | 8400 | 0.7196 | 0.1759 | 0.0708 | 0.1723 | 0.1723 | 18.1225 |
0.2408 | 43.0 | 8600 | 0.7376 | 0.1692 | 0.0675 | 0.1684 | 0.1694 | 17.6075 |
0.2408 | 44.0 | 8800 | 0.7399 | 0.1715 | 0.0709 | 0.1682 | 0.1697 | 17.475 |
0.235 | 45.0 | 9000 | 0.7446 | 0.1759 | 0.0733 | 0.1724 | 0.1725 | 17.875 |
0.235 | 46.0 | 9200 | 0.7317 | 0.1663 | 0.0763 | 0.1656 | 0.1662 | 17.7375 |
0.235 | 47.0 | 9400 | 0.7472 | 0.1719 | 0.0684 | 0.1682 | 0.1685 | 17.94 |
0.2224 | 48.0 | 9600 | 0.7423 | 0.1743 | 0.0774 | 0.1725 | 0.1722 | 17.88 |
0.2224 | 49.0 | 9800 | 0.7450 | 0.1623 | 0.0723 | 0.1613 | 0.1619 | 17.9675 |
0.219 | 50.0 | 10000 | 0.7754 | 0.1442 | 0.0613 | 0.1437 | 0.1436 | 17.86 |
0.219 | 51.0 | 10200 | 0.7788 | 0.1652 | 0.0762 | 0.1622 | 0.1622 | 17.415 |
0.219 | 52.0 | 10400 | 0.7741 | 0.1655 | 0.0677 | 0.162 | 0.1633 | 17.9625 |
0.209 | 53.0 | 10600 | 0.7954 | 0.1803 | 0.076 | 0.1782 | 0.1784 | 18.0875 |
0.209 | 54.0 | 10800 | 0.8134 | 0.1639 | 0.0758 | 0.1623 | 0.1621 | 17.6575 |
0.2037 | 55.0 | 11000 | 0.8207 | 0.1621 | 0.0665 | 0.1614 | 0.1617 | 17.595 |
0.2037 | 56.0 | 11200 | 0.8326 | 0.1549 | 0.0701 | 0.1549 | 0.1549 | 17.7675 |
0.2037 | 57.0 | 11400 | 0.8264 | 0.1801 | 0.0807 | 0.1781 | 0.1778 | 17.5775 |
0.1962 | 58.0 | 11600 | 0.8321 | 0.167 | 0.0734 | 0.1663 | 0.1658 | 17.7975 |
0.1962 | 59.0 | 11800 | 0.8283 | 0.171 | 0.0713 | 0.1681 | 0.1675 | 17.6725 |
0.1888 | 60.0 | 12000 | 0.8349 | 0.1657 | 0.0761 | 0.1641 | 0.1647 | 17.5625 |
0.1888 | 61.0 | 12200 | 0.8458 | 0.1818 | 0.0799 | 0.1811 | 0.18 | 17.79 |
0.1888 | 62.0 | 12400 | 0.8693 | 0.1979 | 0.0784 | 0.1943 | 0.1938 | 17.52 |
0.1832 | 63.0 | 12600 | 0.8598 | 0.1766 | 0.0812 | 0.1762 | 0.1754 | 17.87 |
0.1832 | 64.0 | 12800 | 0.8845 | 0.1694 | 0.073 | 0.1688 | 0.1682 | 17.5725 |
0.1785 | 65.0 | 13000 | 0.8745 | 0.1909 | 0.0747 | 0.1881 | 0.1881 | 17.51 |
0.1785 | 66.0 | 13200 | 0.8929 | 0.1812 | 0.0715 | 0.1808 | 0.1807 | 17.34 |
0.1785 | 67.0 | 13400 | 0.8821 | 0.1831 | 0.0713 | 0.1813 | 0.1813 | 17.4575 |
0.174 | 68.0 | 13600 | 0.8809 | 0.197 | 0.0771 | 0.1941 | 0.194 | 17.665 |
0.174 | 69.0 | 13800 | 0.8868 | 0.19 | 0.0749 | 0.1859 | 0.1858 | 17.6825 |
0.1697 | 70.0 | 14000 | 0.8952 | 0.1853 | 0.0759 | 0.1816 | 0.1815 | 17.44 |
0.1697 | 71.0 | 14200 | 0.9106 | 0.1878 | 0.0744 | 0.1834 | 0.1835 | 17.4675 |
0.1697 | 72.0 | 14400 | 0.9185 | 0.1883 | 0.0781 | 0.1856 | 0.1863 | 17.5475 |
0.1646 | 73.0 | 14600 | 0.9060 | 0.1809 | 0.0731 | 0.1776 | 0.178 | 17.5725 |
0.1646 | 74.0 | 14800 | 0.9225 | 0.1822 | 0.0707 | 0.1794 | 0.1798 | 17.255 |
0.163 | 75.0 | 15000 | 0.9178 | 0.1759 | 0.0659 | 0.1751 | 0.1755 | 17.5125 |
0.163 | 76.0 | 15200 | 0.9366 | 0.1923 | 0.0738 | 0.1883 | 0.1886 | 17.31 |
0.163 | 77.0 | 15400 | 0.9387 | 0.1884 | 0.0699 | 0.1852 | 0.1855 | 17.4875 |
0.1582 | 78.0 | 15600 | 0.9315 | 0.1869 | 0.0773 | 0.1831 | 0.1834 | 17.5775 |
0.1582 | 79.0 | 15800 | 0.9339 | 0.19 | 0.0738 | 0.1862 | 0.1877 | 17.6125 |
0.1537 | 80.0 | 16000 | 0.9501 | 0.192 | 0.0744 | 0.1894 | 0.1889 | 17.1625 |
0.1537 | 81.0 | 16200 | 0.9403 | 0.1874 | 0.074 | 0.1844 | 0.1845 | 17.5075 |
0.1537 | 82.0 | 16400 | 0.9448 | 0.1919 | 0.0771 | 0.1889 | 0.1893 | 17.615 |
0.1528 | 83.0 | 16600 | 0.9521 | 0.1924 | 0.0795 | 0.1902 | 0.1908 | 17.685 |
0.1528 | 84.0 | 16800 | 0.9518 | 0.1974 | 0.0844 | 0.1944 | 0.1956 | 17.61 |
0.1484 | 85.0 | 17000 | 0.9580 | 0.2139 | 0.0815 | 0.2105 | 0.2108 | 17.645 |
0.1484 | 86.0 | 17200 | 0.9626 | 0.1871 | 0.0719 | 0.1849 | 0.1846 | 17.5075 |
0.1484 | 87.0 | 17400 | 0.9746 | 0.1985 | 0.0795 | 0.1958 | 0.1957 | 17.47 |
0.144 | 88.0 | 17600 | 0.9726 | 0.1947 | 0.0771 | 0.1917 | 0.1913 | 17.4475 |
0.144 | 89.0 | 17800 | 0.9757 | 0.1986 | 0.0804 | 0.1954 | 0.1946 | 17.4775 |
0.1448 | 90.0 | 18000 | 0.9788 | 0.1989 | 0.0826 | 0.1961 | 0.1945 | 17.4775 |
0.1448 | 91.0 | 18200 | 0.9776 | 0.1881 | 0.0742 | 0.1855 | 0.1847 | 17.4275 |
0.1448 | 92.0 | 18400 | 0.9791 | 0.186 | 0.0698 | 0.183 | 0.1827 | 17.5 |
0.1417 | 93.0 | 18600 | 0.9812 | 0.1917 | 0.0746 | 0.1891 | 0.1884 | 17.53 |
0.1417 | 94.0 | 18800 | 0.9913 | 0.1937 | 0.0786 | 0.1912 | 0.1905 | 17.4125 |
0.1399 | 95.0 | 19000 | 0.9935 | 0.1942 | 0.0814 | 0.1917 | 0.1912 | 17.5425 |
0.1399 | 96.0 | 19200 | 0.9887 | 0.1909 | 0.0686 | 0.1882 | 0.1879 | 17.49 |
0.1399 | 97.0 | 19400 | 0.9899 | 0.1921 | 0.0745 | 0.1897 | 0.1888 | 17.4275 |
0.1399 | 98.0 | 19600 | 0.9910 | 0.1909 | 0.0747 | 0.1879 | 0.187 | 17.4225 |
0.1399 | 99.0 | 19800 | 0.9922 | 0.1889 | 0.0738 | 0.1864 | 0.186 | 17.445 |
0.1392 | 100.0 | 20000 | 0.9925 | 0.1865 | 0.0712 | 0.184 | 0.1834 | 17.465 |
### Framework versions
- Transformers 4.29.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3