# flan-t5-small-gen-chat_v3
This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.9786
- Rouge 1: 9.6963
- Rouge 2: 1.3933
- Rouge L: 8.9954
- Avg Len: 12.0037
- Bertscore Prec: 0.8643
- Bertscore Rec: 0.855
- Bertscore F1: 0.8593
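The checkpoint loads like any FLAN-T5 model. Below is a minimal inference sketch, assuming the model is published as `greatakela/flan-t5-small-gen-chat_v3` and that chat-style prompts are passed as plain text (the card does not document the expected prompt format):

```python
# Minimal inference sketch (not part of the original card); the prompt below is a
# hypothetical example, as the intended input format is not documented.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "greatakela/flan-t5-small-gen-chat_v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = "Hello, how are you today?"  # placeholder chat-style input
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The average generation length reported above (~12 tokens) suggests short, single-turn replies.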
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 8
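A hedged sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; dataset preparation, tokenization, and the trainer itself are omitted because the card does not describe them, and the output directory below is a placeholder:

```python
# Sketch of the training configuration implied by the hyperparameter list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-gen-chat_v3",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",        # AdamW (torch); betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
)
```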
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
4.1849 | 0.0886 | 200 | 3.8165 | 5.8097 | 0.3202 | 5.3973 | 11.8954 | 0.8348 | 0.8322 | 0.8332 |
3.9771 | 0.1771 | 400 | 3.6720 | 6.0081 | 0.3853 | 5.6136 | 13.2256 | 0.8375 | 0.8334 | 0.8352 |
3.8582 | 0.2657 | 600 | 3.5856 | 5.6377 | 0.3018 | 5.3033 | 12.7928 | 0.8417 | 0.8362 | 0.8385 |
3.7999 | 0.3543 | 800 | 3.5083 | 5.5186 | 0.2521 | 5.1755 | 13.0536 | 0.8446 | 0.8406 | 0.8422 |
3.7116 | 0.4429 | 1000 | 3.4452 | 4.8599 | 0.2578 | 4.6238 | 12.0016 | 0.851 | 0.8433 | 0.8467 |
3.6357 | 0.5314 | 1200 | 3.3819 | 5.4195 | 0.2351 | 5.0996 | 12.1052 | 0.8559 | 0.8465 | 0.8508 |
3.589 | 0.6200 | 1400 | 3.3247 | 5.6178 | 0.3175 | 5.3152 | 12.0741 | 0.8597 | 0.8475 | 0.8532 |
3.5486 | 0.7086 | 1600 | 3.2654 | 5.9822 | 0.306 | 5.7237 | 13.2056 | 0.8597 | 0.8481 | 0.8534 |
3.4733 | 0.7972 | 1800 | 3.2081 | 6.5603 | 0.3579 | 6.1843 | 12.4201 | 0.8588 | 0.8485 | 0.8532 |
3.4067 | 0.8857 | 2000 | 3.1521 | 6.971 | 0.4266 | 6.5181 | 12.4206 | 0.8615 | 0.8509 | 0.8558 |
3.3822 | 0.9743 | 2200 | 3.1045 | 6.4559 | 0.4137 | 6.1487 | 13.4043 | 0.8629 | 0.8506 | 0.8563 |
3.3336 | 1.0629 | 2400 | 3.0578 | 7.2074 | 0.4162 | 6.7473 | 13.5994 | 0.8614 | 0.8521 | 0.8564 |
3.2777 | 1.1515 | 2600 | 3.0174 | 7.3221 | 0.42 | 6.8584 | 13.6125 | 0.8625 | 0.8524 | 0.857 |
3.2434 | 1.2400 | 2800 | 2.9788 | 6.6126 | 0.473 | 6.2565 | 13.4753 | 0.8612 | 0.851 | 0.8556 |
3.19 | 1.3286 | 3000 | 2.9445 | 7.3159 | 0.4829 | 6.8595 | 13.244 | 0.8637 | 0.8527 | 0.8578 |
3.1548 | 1.4172 | 3200 | 2.9119 | 7.5114 | 0.5239 | 6.9946 | 12.8333 | 0.867 | 0.8523 | 0.8593 |
3.1495 | 1.5058 | 3400 | 2.8816 | 7.6945 | 0.5481 | 7.205 | 12.9038 | 0.8656 | 0.8527 | 0.8587 |
3.1304 | 1.5943 | 3600 | 2.8490 | 7.5919 | 0.5784 | 7.1099 | 12.6851 | 0.8661 | 0.8532 | 0.8593 |
3.1161 | 1.6829 | 3800 | 2.8193 | 7.5826 | 0.606 | 7.0534 | 13.2713 | 0.8632 | 0.8526 | 0.8575 |
3.0728 | 1.7715 | 4000 | 2.7908 | 7.9963 | 0.6485 | 7.4816 | 13.1178 | 0.8633 | 0.854 | 0.8583 |
3.0435 | 1.8601 | 4200 | 2.7619 | 7.7297 | 0.6519 | 7.1663 | 13.5631 | 0.8624 | 0.8537 | 0.8577 |
3.0435 | 1.9486 | 4400 | 2.7386 | 8.1558 | 0.6462 | 7.5864 | 12.4606 | 0.8652 | 0.8538 | 0.8591 |
2.9904 | 2.0372 | 4600 | 2.7115 | 8.2389 | 0.6351 | 7.6214 | 13.1546 | 0.8637 | 0.8541 | 0.8585 |
2.9561 | 2.1258 | 4800 | 2.6842 | 8.0952 | 0.6869 | 7.4859 | 13.1651 | 0.865 | 0.8546 | 0.8594 |
2.9581 | 2.2143 | 5000 | 2.6625 | 8.3428 | 0.6576 | 7.7467 | 12.1782 | 0.865 | 0.8535 | 0.8589 |
2.9205 | 2.3029 | 5200 | 2.6386 | 8.4692 | 0.6628 | 7.8614 | 12.8927 | 0.8661 | 0.8541 | 0.8597 |
2.8909 | 2.3915 | 5400 | 2.6168 | 8.9372 | 0.7864 | 8.2121 | 13.0857 | 0.8641 | 0.8551 | 0.8592 |
2.8943 | 2.4801 | 5600 | 2.5921 | 8.6149 | 0.7761 | 7.9657 | 13.0762 | 0.8641 | 0.8548 | 0.8591 |
2.8807 | 2.5686 | 5800 | 2.5723 | 8.7656 | 0.8198 | 8.115 | 13.1309 | 0.8655 | 0.855 | 0.8599 |
2.8698 | 2.6572 | 6000 | 2.5483 | 8.7496 | 0.7512 | 8.1781 | 12.4621 | 0.8636 | 0.8546 | 0.8587 |
2.8375 | 2.7458 | 6200 | 2.5314 | 8.3957 | 0.811 | 7.8299 | 12.3601 | 0.864 | 0.8539 | 0.8586 |
2.8169 | 2.8344 | 6400 | 2.5110 | 8.4644 | 0.8284 | 7.897 | 12.4501 | 0.8644 | 0.8543 | 0.859 |
2.8047 | 2.9229 | 6600 | 2.4917 | 9.0923 | 0.8458 | 8.417 | 12.9096 | 0.8652 | 0.8551 | 0.8598 |
2.7935 | 3.0115 | 6800 | 2.4725 | 8.8998 | 0.9062 | 8.2578 | 13.0 | 0.8639 | 0.8551 | 0.8592 |
2.7408 | 3.1001 | 7000 | 2.4529 | 8.8207 | 0.8174 | 8.2216 | 12.2455 | 0.8653 | 0.8545 | 0.8595 |
2.7551 | 3.1887 | 7200 | 2.4364 | 8.8831 | 0.8326 | 8.2174 | 12.6246 | 0.8638 | 0.8543 | 0.8587 |
2.7104 | 3.2772 | 7400 | 2.4171 | 9.1268 | 0.935 | 8.4039 | 12.2392 | 0.8635 | 0.8545 | 0.8586 |
2.7175 | 3.3658 | 7600 | 2.4051 | 8.8107 | 0.8979 | 8.2077 | 12.1488 | 0.8651 | 0.8548 | 0.8596 |
2.7037 | 3.4544 | 7800 | 2.3833 | 8.4842 | 0.8216 | 7.9163 | 12.3849 | 0.8655 | 0.854 | 0.8594 |
2.6905 | 3.5430 | 8000 | 2.3658 | 8.6487 | 0.9095 | 8.0252 | 12.4101 | 0.864 | 0.854 | 0.8586 |
2.6822 | 3.6315 | 8200 | 2.3522 | 8.4983 | 0.8329 | 7.943 | 12.9374 | 0.865 | 0.8541 | 0.8592 |
2.6711 | 3.7201 | 8400 | 2.3368 | 8.8067 | 0.9028 | 8.2338 | 12.979 | 0.8643 | 0.8543 | 0.859 |
2.6793 | 3.8087 | 8600 | 2.3195 | 8.8736 | 0.9553 | 8.2529 | 12.5521 | 0.864 | 0.8544 | 0.8588 |
2.6567 | 3.8973 | 8800 | 2.3052 | 9.3126 | 1.058 | 8.5879 | 12.4516 | 0.8651 | 0.8547 | 0.8595 |
2.6376 | 3.9858 | 9000 | 2.2933 | 9.2685 | 1.1461 | 8.6109 | 11.9085 | 0.8665 | 0.8551 | 0.8604 |
2.6161 | 4.0744 | 9200 | 2.2750 | 9.131 | 1.0839 | 8.4462 | 11.9385 | 0.8653 | 0.8553 | 0.8599 |
2.6107 | 4.1630 | 9400 | 2.2635 | 8.8956 | 1.0494 | 8.2891 | 11.817 | 0.8664 | 0.8547 | 0.8601 |
2.5926 | 4.2516 | 9600 | 2.2494 | 9.1648 | 1.006 | 8.5539 | 12.0084 | 0.8652 | 0.8545 | 0.8594 |
2.5863 | 4.3401 | 9800 | 2.2380 | 9.3821 | 1.1255 | 8.8075 | 11.6199 | 0.8677 | 0.8548 | 0.8609 |
2.5654 | 4.4287 | 10000 | 2.2242 | 9.1769 | 1.0914 | 8.5615 | 12.0484 | 0.8647 | 0.8547 | 0.8593 |
2.5592 | 4.5173 | 10200 | 2.2109 | 9.5718 | 1.2002 | 8.8977 | 12.091 | 0.8666 | 0.8551 | 0.8605 |
2.5595 | 4.6058 | 10400 | 2.1977 | 9.6277 | 1.1425 | 8.928 | 11.8028 | 0.8664 | 0.8546 | 0.8601 |
2.5367 | 4.6944 | 10600 | 2.1860 | 9.9977 | 1.2729 | 9.2575 | 11.3985 | 0.8661 | 0.8549 | 0.8601 |
2.5426 | 4.7830 | 10800 | 2.1752 | 9.9377 | 1.1987 | 9.165 | 11.8912 | 0.8657 | 0.8553 | 0.8601 |
2.5285 | 4.8716 | 11000 | 2.1652 | 9.7844 | 1.2168 | 9.0341 | 11.7529 | 0.8665 | 0.8556 | 0.8606 |
2.4936 | 4.9601 | 11200 | 2.1547 | 9.8466 | 1.2359 | 9.085 | 12.0226 | 0.8659 | 0.8555 | 0.8604 |
2.4924 | 5.0487 | 11400 | 2.1449 | 9.5876 | 1.211 | 8.9224 | 12.1356 | 0.8653 | 0.8553 | 0.8599 |
2.473 | 5.1373 | 11600 | 2.1345 | 9.8693 | 1.2689 | 9.1436 | 12.2045 | 0.8654 | 0.8557 | 0.8602 |
2.4996 | 5.2259 | 11800 | 2.1260 | 9.8594 | 1.1983 | 9.1219 | 12.1236 | 0.8656 | 0.8557 | 0.8603 |
2.4804 | 5.3144 | 12000 | 2.1156 | 9.513 | 1.1218 | 8.7785 | 11.8843 | 0.8656 | 0.8548 | 0.8598 |
2.4735 | 5.4030 | 12200 | 2.1068 | 9.7441 | 1.138 | 9.0314 | 12.0857 | 0.8648 | 0.8548 | 0.8594 |
2.4607 | 5.4916 | 12400 | 2.0986 | 9.3751 | 1.1981 | 8.7405 | 11.8544 | 0.8647 | 0.8547 | 0.8593 |
2.4562 | 5.5802 | 12600 | 2.0936 | 10.0681 | 1.2645 | 9.2824 | 12.5899 | 0.864 | 0.8555 | 0.8594 |
2.4406 | 5.6687 | 12800 | 2.0837 | 9.8866 | 1.2947 | 9.1567 | 12.1136 | 0.8649 | 0.8554 | 0.8598 |
2.4489 | 5.7573 | 13000 | 2.0772 | 9.8192 | 1.1804 | 9.0906 | 11.7513 | 0.8657 | 0.8551 | 0.8601 |
2.4462 | 5.8459 | 13200 | 2.0692 | 9.6704 | 1.2094 | 8.9122 | 12.2471 | 0.8641 | 0.8553 | 0.8593 |
2.4406 | 5.9345 | 13400 | 2.0621 | 9.7162 | 1.3014 | 9.0299 | 12.0831 | 0.8635 | 0.8547 | 0.8587 |
2.4221 | 6.0230 | 13600 | 2.0561 | 9.9116 | 1.3926 | 9.209 | 12.0862 | 0.864 | 0.855 | 0.8591 |
2.435 | 6.1116 | 13800 | 2.0476 | 9.835 | 1.3613 | 9.0911 | 12.256 | 0.8641 | 0.8552 | 0.8593 |
2.4165 | 6.2002 | 14000 | 2.0421 | 9.6727 | 1.3095 | 9.0112 | 11.8948 | 0.8635 | 0.855 | 0.8589 |
2.3935 | 6.2888 | 14200 | 2.0343 | 9.9215 | 1.3953 | 9.2393 | 12.0473 | 0.8637 | 0.8551 | 0.859 |
2.4261 | 6.3773 | 14400 | 2.0297 | 9.8114 | 1.4524 | 9.0984 | 12.0021 | 0.8639 | 0.8552 | 0.8592 |
2.394 | 6.4659 | 14600 | 2.0240 | 9.5654 | 1.2787 | 8.8663 | 11.9732 | 0.8641 | 0.8549 | 0.8591 |
2.3933 | 6.5545 | 14800 | 2.0203 | 9.8739 | 1.3828 | 9.1743 | 12.0726 | 0.8641 | 0.8552 | 0.8593 |
2.4024 | 6.6430 | 15000 | 2.0159 | 9.609 | 1.2472 | 8.9373 | 11.9537 | 0.8643 | 0.8549 | 0.8592 |
2.3869 | 6.7316 | 15200 | 2.0106 | 9.9155 | 1.3333 | 9.1962 | 11.8244 | 0.8651 | 0.8553 | 0.8598 |
2.3739 | 6.8202 | 15400 | 2.0058 | 10.0558 | 1.5073 | 9.3033 | 12.0831 | 0.8633 | 0.8551 | 0.8588 |
2.3949 | 6.9088 | 15600 | 2.0029 | 9.8884 | 1.3915 | 9.1274 | 11.9543 | 0.8638 | 0.8551 | 0.8591 |
2.3669 | 6.9973 | 15800 | 1.9990 | 9.8894 | 1.4128 | 9.1332 | 11.959 | 0.8637 | 0.8552 | 0.8591 |
2.3601 | 7.0859 | 16000 | 1.9956 | 9.9601 | 1.4481 | 9.2314 | 12.0042 | 0.8637 | 0.8554 | 0.8592 |
2.3714 | 7.1745 | 16200 | 1.9924 | 9.7765 | 1.4063 | 9.0989 | 11.9522 | 0.8639 | 0.8551 | 0.8591 |
2.3808 | 7.2631 | 16400 | 1.9905 | 9.9078 | 1.395 | 9.1911 | 12.0599 | 0.8638 | 0.8553 | 0.8591 |
2.3462 | 7.3516 | 16600 | 1.9864 | 9.8945 | 1.4247 | 9.1194 | 12.0852 | 0.8637 | 0.8553 | 0.8591 |
2.3824 | 7.4402 | 16800 | 1.9853 | 9.7483 | 1.3538 | 9.0057 | 11.9395 | 0.864 | 0.8549 | 0.859 |
2.366 | 7.5288 | 17000 | 1.9841 | 9.8888 | 1.4348 | 9.1384 | 12.0179 | 0.8638 | 0.8552 | 0.8591 |
2.3618 | 7.6174 | 17200 | 1.9813 | 9.841 | 1.4329 | 9.0831 | 12.0216 | 0.8644 | 0.8551 | 0.8594 |
2.3614 | 7.7059 | 17400 | 1.9799 | 9.6973 | 1.3645 | 8.9982 | 11.9732 | 0.8643 | 0.8549 | 0.8592 |
2.3658 | 7.7945 | 17600 | 1.9795 | 9.8251 | 1.4344 | 9.1177 | 11.9826 | 0.8647 | 0.8551 | 0.8595 |
2.3609 | 7.8831 | 17800 | 1.9789 | 9.7443 | 1.3928 | 9.0298 | 11.9879 | 0.8644 | 0.855 | 0.8593 |
2.3384 | 7.9717 | 18000 | 1.9786 | 9.6963 | 1.3933 | 8.9954 | 12.0037 | 0.8643 | 0.855 | 0.8593 |
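The ROUGE and BERTScore columns above can be computed with the `evaluate` library; a minimal sketch, using placeholder prediction and reference strings since the actual evaluation data and metric configuration are not documented:

```python
# Hedged sketch of computing ROUGE and BERTScore on decoded model outputs.
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["I'm doing well, thank you."]    # decoded model outputs (placeholder)
references = ["I am fine, thanks for asking."]  # gold responses (placeholder)

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

print("ROUGE-1:", rouge_scores["rouge1"])
print("ROUGE-2:", rouge_scores["rouge2"])
print("ROUGE-L:", rouge_scores["rougeL"])
print("BERTScore F1:", sum(bert_scores["f1"]) / len(bert_scores["f1"]))
```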
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1