flan-t5-small-gen-chat_v3

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9786
  • Rouge 1: 9.6963
  • Rouge 2: 1.3933
  • Rouge L: 8.9954
  • Avg Len: 12.0037
  • Bertscore Prec: 0.8643
  • Bertscore Rec: 0.855
  • Bertscore F1: 0.8593
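For context, this is a minimal sketch of how ROUGE and BERTScore figures of this kind are typically computed with the Hugging Face `evaluate` library. The predictions and references below are placeholder strings, since the evaluation data is not documented in this card:

```python
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["a generated chat reply"]   # hypothetical model outputs
references = ["a reference chat reply"]    # hypothetical gold responses

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert = bertscore.compute(predictions=predictions, references=references, lang="en")

print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"])
print(sum(bert["f1"]) / len(bert["f1"]))  # mean BERTScore F1
```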

Model description

More information needed

Intended uses & limitations

More information needed
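Pending fuller documentation, the following is a minimal inference sketch using the standard Transformers seq2seq API, assuming the Hub ID greatakela/flan-t5-small-gen-chat_v3:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "greatakela/flan-t5-small-gen-chat_v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Hello, how are you today?", return_tensors="pt")
# max_new_tokens is a guess; the reported average generation length is ~12 tokens.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```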

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 8
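As a rough reconstruction, these settings map onto `Seq2SeqTrainingArguments` as sketched below. The output_dir is an assumption, and the 200-step evaluation cadence is inferred from the results table rather than documented:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-gen-chat_v3",  # assumed, not documented
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
    eval_strategy="steps",  # assumption inferred from the 200-step cadence below
    eval_steps=200,
)
```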

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:-------:|:--------------:|:-------------:|:------------:|
| 4.1849 | 0.0886 | 200 | 3.8165 | 5.8097 | 0.3202 | 5.3973 | 11.8954 | 0.8348 | 0.8322 | 0.8332 |
| 3.9771 | 0.1771 | 400 | 3.6720 | 6.0081 | 0.3853 | 5.6136 | 13.2256 | 0.8375 | 0.8334 | 0.8352 |
| 3.8582 | 0.2657 | 600 | 3.5856 | 5.6377 | 0.3018 | 5.3033 | 12.7928 | 0.8417 | 0.8362 | 0.8385 |
| 3.7999 | 0.3543 | 800 | 3.5083 | 5.5186 | 0.2521 | 5.1755 | 13.0536 | 0.8446 | 0.8406 | 0.8422 |
| 3.7116 | 0.4429 | 1000 | 3.4452 | 4.8599 | 0.2578 | 4.6238 | 12.0016 | 0.851 | 0.8433 | 0.8467 |
| 3.6357 | 0.5314 | 1200 | 3.3819 | 5.4195 | 0.2351 | 5.0996 | 12.1052 | 0.8559 | 0.8465 | 0.8508 |
| 3.589 | 0.6200 | 1400 | 3.3247 | 5.6178 | 0.3175 | 5.3152 | 12.0741 | 0.8597 | 0.8475 | 0.8532 |
| 3.5486 | 0.7086 | 1600 | 3.2654 | 5.9822 | 0.306 | 5.7237 | 13.2056 | 0.8597 | 0.8481 | 0.8534 |
| 3.4733 | 0.7972 | 1800 | 3.2081 | 6.5603 | 0.3579 | 6.1843 | 12.4201 | 0.8588 | 0.8485 | 0.8532 |
| 3.4067 | 0.8857 | 2000 | 3.1521 | 6.971 | 0.4266 | 6.5181 | 12.4206 | 0.8615 | 0.8509 | 0.8558 |
| 3.3822 | 0.9743 | 2200 | 3.1045 | 6.4559 | 0.4137 | 6.1487 | 13.4043 | 0.8629 | 0.8506 | 0.8563 |
| 3.3336 | 1.0629 | 2400 | 3.0578 | 7.2074 | 0.4162 | 6.7473 | 13.5994 | 0.8614 | 0.8521 | 0.8564 |
| 3.2777 | 1.1515 | 2600 | 3.0174 | 7.3221 | 0.42 | 6.8584 | 13.6125 | 0.8625 | 0.8524 | 0.857 |
| 3.2434 | 1.2400 | 2800 | 2.9788 | 6.6126 | 0.473 | 6.2565 | 13.4753 | 0.8612 | 0.851 | 0.8556 |
| 3.19 | 1.3286 | 3000 | 2.9445 | 7.3159 | 0.4829 | 6.8595 | 13.244 | 0.8637 | 0.8527 | 0.8578 |
| 3.1548 | 1.4172 | 3200 | 2.9119 | 7.5114 | 0.5239 | 6.9946 | 12.8333 | 0.867 | 0.8523 | 0.8593 |
| 3.1495 | 1.5058 | 3400 | 2.8816 | 7.6945 | 0.5481 | 7.205 | 12.9038 | 0.8656 | 0.8527 | 0.8587 |
| 3.1304 | 1.5943 | 3600 | 2.8490 | 7.5919 | 0.5784 | 7.1099 | 12.6851 | 0.8661 | 0.8532 | 0.8593 |
| 3.1161 | 1.6829 | 3800 | 2.8193 | 7.5826 | 0.606 | 7.0534 | 13.2713 | 0.8632 | 0.8526 | 0.8575 |
| 3.0728 | 1.7715 | 4000 | 2.7908 | 7.9963 | 0.6485 | 7.4816 | 13.1178 | 0.8633 | 0.854 | 0.8583 |
| 3.0435 | 1.8601 | 4200 | 2.7619 | 7.7297 | 0.6519 | 7.1663 | 13.5631 | 0.8624 | 0.8537 | 0.8577 |
| 3.0435 | 1.9486 | 4400 | 2.7386 | 8.1558 | 0.6462 | 7.5864 | 12.4606 | 0.8652 | 0.8538 | 0.8591 |
| 2.9904 | 2.0372 | 4600 | 2.7115 | 8.2389 | 0.6351 | 7.6214 | 13.1546 | 0.8637 | 0.8541 | 0.8585 |
| 2.9561 | 2.1258 | 4800 | 2.6842 | 8.0952 | 0.6869 | 7.4859 | 13.1651 | 0.865 | 0.8546 | 0.8594 |
| 2.9581 | 2.2143 | 5000 | 2.6625 | 8.3428 | 0.6576 | 7.7467 | 12.1782 | 0.865 | 0.8535 | 0.8589 |
| 2.9205 | 2.3029 | 5200 | 2.6386 | 8.4692 | 0.6628 | 7.8614 | 12.8927 | 0.8661 | 0.8541 | 0.8597 |
| 2.8909 | 2.3915 | 5400 | 2.6168 | 8.9372 | 0.7864 | 8.2121 | 13.0857 | 0.8641 | 0.8551 | 0.8592 |
| 2.8943 | 2.4801 | 5600 | 2.5921 | 8.6149 | 0.7761 | 7.9657 | 13.0762 | 0.8641 | 0.8548 | 0.8591 |
| 2.8807 | 2.5686 | 5800 | 2.5723 | 8.7656 | 0.8198 | 8.115 | 13.1309 | 0.8655 | 0.855 | 0.8599 |
| 2.8698 | 2.6572 | 6000 | 2.5483 | 8.7496 | 0.7512 | 8.1781 | 12.4621 | 0.8636 | 0.8546 | 0.8587 |
| 2.8375 | 2.7458 | 6200 | 2.5314 | 8.3957 | 0.811 | 7.8299 | 12.3601 | 0.864 | 0.8539 | 0.8586 |
| 2.8169 | 2.8344 | 6400 | 2.5110 | 8.4644 | 0.8284 | 7.897 | 12.4501 | 0.8644 | 0.8543 | 0.859 |
| 2.8047 | 2.9229 | 6600 | 2.4917 | 9.0923 | 0.8458 | 8.417 | 12.9096 | 0.8652 | 0.8551 | 0.8598 |
| 2.7935 | 3.0115 | 6800 | 2.4725 | 8.8998 | 0.9062 | 8.2578 | 13.0 | 0.8639 | 0.8551 | 0.8592 |
| 2.7408 | 3.1001 | 7000 | 2.4529 | 8.8207 | 0.8174 | 8.2216 | 12.2455 | 0.8653 | 0.8545 | 0.8595 |
| 2.7551 | 3.1887 | 7200 | 2.4364 | 8.8831 | 0.8326 | 8.2174 | 12.6246 | 0.8638 | 0.8543 | 0.8587 |
| 2.7104 | 3.2772 | 7400 | 2.4171 | 9.1268 | 0.935 | 8.4039 | 12.2392 | 0.8635 | 0.8545 | 0.8586 |
| 2.7175 | 3.3658 | 7600 | 2.4051 | 8.8107 | 0.8979 | 8.2077 | 12.1488 | 0.8651 | 0.8548 | 0.8596 |
| 2.7037 | 3.4544 | 7800 | 2.3833 | 8.4842 | 0.8216 | 7.9163 | 12.3849 | 0.8655 | 0.854 | 0.8594 |
| 2.6905 | 3.5430 | 8000 | 2.3658 | 8.6487 | 0.9095 | 8.0252 | 12.4101 | 0.864 | 0.854 | 0.8586 |
| 2.6822 | 3.6315 | 8200 | 2.3522 | 8.4983 | 0.8329 | 7.943 | 12.9374 | 0.865 | 0.8541 | 0.8592 |
| 2.6711 | 3.7201 | 8400 | 2.3368 | 8.8067 | 0.9028 | 8.2338 | 12.979 | 0.8643 | 0.8543 | 0.859 |
| 2.6793 | 3.8087 | 8600 | 2.3195 | 8.8736 | 0.9553 | 8.2529 | 12.5521 | 0.864 | 0.8544 | 0.8588 |
| 2.6567 | 3.8973 | 8800 | 2.3052 | 9.3126 | 1.058 | 8.5879 | 12.4516 | 0.8651 | 0.8547 | 0.8595 |
| 2.6376 | 3.9858 | 9000 | 2.2933 | 9.2685 | 1.1461 | 8.6109 | 11.9085 | 0.8665 | 0.8551 | 0.8604 |
| 2.6161 | 4.0744 | 9200 | 2.2750 | 9.131 | 1.0839 | 8.4462 | 11.9385 | 0.8653 | 0.8553 | 0.8599 |
| 2.6107 | 4.1630 | 9400 | 2.2635 | 8.8956 | 1.0494 | 8.2891 | 11.817 | 0.8664 | 0.8547 | 0.8601 |
| 2.5926 | 4.2516 | 9600 | 2.2494 | 9.1648 | 1.006 | 8.5539 | 12.0084 | 0.8652 | 0.8545 | 0.8594 |
| 2.5863 | 4.3401 | 9800 | 2.2380 | 9.3821 | 1.1255 | 8.8075 | 11.6199 | 0.8677 | 0.8548 | 0.8609 |
| 2.5654 | 4.4287 | 10000 | 2.2242 | 9.1769 | 1.0914 | 8.5615 | 12.0484 | 0.8647 | 0.8547 | 0.8593 |
| 2.5592 | 4.5173 | 10200 | 2.2109 | 9.5718 | 1.2002 | 8.8977 | 12.091 | 0.8666 | 0.8551 | 0.8605 |
| 2.5595 | 4.6058 | 10400 | 2.1977 | 9.6277 | 1.1425 | 8.928 | 11.8028 | 0.8664 | 0.8546 | 0.8601 |
| 2.5367 | 4.6944 | 10600 | 2.1860 | 9.9977 | 1.2729 | 9.2575 | 11.3985 | 0.8661 | 0.8549 | 0.8601 |
| 2.5426 | 4.7830 | 10800 | 2.1752 | 9.9377 | 1.1987 | 9.165 | 11.8912 | 0.8657 | 0.8553 | 0.8601 |
| 2.5285 | 4.8716 | 11000 | 2.1652 | 9.7844 | 1.2168 | 9.0341 | 11.7529 | 0.8665 | 0.8556 | 0.8606 |
| 2.4936 | 4.9601 | 11200 | 2.1547 | 9.8466 | 1.2359 | 9.085 | 12.0226 | 0.8659 | 0.8555 | 0.8604 |
| 2.4924 | 5.0487 | 11400 | 2.1449 | 9.5876 | 1.211 | 8.9224 | 12.1356 | 0.8653 | 0.8553 | 0.8599 |
| 2.473 | 5.1373 | 11600 | 2.1345 | 9.8693 | 1.2689 | 9.1436 | 12.2045 | 0.8654 | 0.8557 | 0.8602 |
| 2.4996 | 5.2259 | 11800 | 2.1260 | 9.8594 | 1.1983 | 9.1219 | 12.1236 | 0.8656 | 0.8557 | 0.8603 |
| 2.4804 | 5.3144 | 12000 | 2.1156 | 9.513 | 1.1218 | 8.7785 | 11.8843 | 0.8656 | 0.8548 | 0.8598 |
| 2.4735 | 5.4030 | 12200 | 2.1068 | 9.7441 | 1.138 | 9.0314 | 12.0857 | 0.8648 | 0.8548 | 0.8594 |
| 2.4607 | 5.4916 | 12400 | 2.0986 | 9.3751 | 1.1981 | 8.7405 | 11.8544 | 0.8647 | 0.8547 | 0.8593 |
| 2.4562 | 5.5802 | 12600 | 2.0936 | 10.0681 | 1.2645 | 9.2824 | 12.5899 | 0.864 | 0.8555 | 0.8594 |
| 2.4406 | 5.6687 | 12800 | 2.0837 | 9.8866 | 1.2947 | 9.1567 | 12.1136 | 0.8649 | 0.8554 | 0.8598 |
| 2.4489 | 5.7573 | 13000 | 2.0772 | 9.8192 | 1.1804 | 9.0906 | 11.7513 | 0.8657 | 0.8551 | 0.8601 |
| 2.4462 | 5.8459 | 13200 | 2.0692 | 9.6704 | 1.2094 | 8.9122 | 12.2471 | 0.8641 | 0.8553 | 0.8593 |
| 2.4406 | 5.9345 | 13400 | 2.0621 | 9.7162 | 1.3014 | 9.0299 | 12.0831 | 0.8635 | 0.8547 | 0.8587 |
| 2.4221 | 6.0230 | 13600 | 2.0561 | 9.9116 | 1.3926 | 9.209 | 12.0862 | 0.864 | 0.855 | 0.8591 |
| 2.435 | 6.1116 | 13800 | 2.0476 | 9.835 | 1.3613 | 9.0911 | 12.256 | 0.8641 | 0.8552 | 0.8593 |
| 2.4165 | 6.2002 | 14000 | 2.0421 | 9.6727 | 1.3095 | 9.0112 | 11.8948 | 0.8635 | 0.855 | 0.8589 |
| 2.3935 | 6.2888 | 14200 | 2.0343 | 9.9215 | 1.3953 | 9.2393 | 12.0473 | 0.8637 | 0.8551 | 0.859 |
| 2.4261 | 6.3773 | 14400 | 2.0297 | 9.8114 | 1.4524 | 9.0984 | 12.0021 | 0.8639 | 0.8552 | 0.8592 |
| 2.394 | 6.4659 | 14600 | 2.0240 | 9.5654 | 1.2787 | 8.8663 | 11.9732 | 0.8641 | 0.8549 | 0.8591 |
| 2.3933 | 6.5545 | 14800 | 2.0203 | 9.8739 | 1.3828 | 9.1743 | 12.0726 | 0.8641 | 0.8552 | 0.8593 |
| 2.4024 | 6.6430 | 15000 | 2.0159 | 9.609 | 1.2472 | 8.9373 | 11.9537 | 0.8643 | 0.8549 | 0.8592 |
| 2.3869 | 6.7316 | 15200 | 2.0106 | 9.9155 | 1.3333 | 9.1962 | 11.8244 | 0.8651 | 0.8553 | 0.8598 |
| 2.3739 | 6.8202 | 15400 | 2.0058 | 10.0558 | 1.5073 | 9.3033 | 12.0831 | 0.8633 | 0.8551 | 0.8588 |
| 2.3949 | 6.9088 | 15600 | 2.0029 | 9.8884 | 1.3915 | 9.1274 | 11.9543 | 0.8638 | 0.8551 | 0.8591 |
| 2.3669 | 6.9973 | 15800 | 1.9990 | 9.8894 | 1.4128 | 9.1332 | 11.959 | 0.8637 | 0.8552 | 0.8591 |
| 2.3601 | 7.0859 | 16000 | 1.9956 | 9.9601 | 1.4481 | 9.2314 | 12.0042 | 0.8637 | 0.8554 | 0.8592 |
| 2.3714 | 7.1745 | 16200 | 1.9924 | 9.7765 | 1.4063 | 9.0989 | 11.9522 | 0.8639 | 0.8551 | 0.8591 |
| 2.3808 | 7.2631 | 16400 | 1.9905 | 9.9078 | 1.395 | 9.1911 | 12.0599 | 0.8638 | 0.8553 | 0.8591 |
| 2.3462 | 7.3516 | 16600 | 1.9864 | 9.8945 | 1.4247 | 9.1194 | 12.0852 | 0.8637 | 0.8553 | 0.8591 |
| 2.3824 | 7.4402 | 16800 | 1.9853 | 9.7483 | 1.3538 | 9.0057 | 11.9395 | 0.864 | 0.8549 | 0.859 |
| 2.366 | 7.5288 | 17000 | 1.9841 | 9.8888 | 1.4348 | 9.1384 | 12.0179 | 0.8638 | 0.8552 | 0.8591 |
| 2.3618 | 7.6174 | 17200 | 1.9813 | 9.841 | 1.4329 | 9.0831 | 12.0216 | 0.8644 | 0.8551 | 0.8594 |
| 2.3614 | 7.7059 | 17400 | 1.9799 | 9.6973 | 1.3645 | 8.9982 | 11.9732 | 0.8643 | 0.8549 | 0.8592 |
| 2.3658 | 7.7945 | 17600 | 1.9795 | 9.8251 | 1.4344 | 9.1177 | 11.9826 | 0.8647 | 0.8551 | 0.8595 |
| 2.3609 | 7.8831 | 17800 | 1.9789 | 9.7443 | 1.3928 | 9.0298 | 11.9879 | 0.8644 | 0.855 | 0.8593 |
| 2.3384 | 7.9717 | 18000 | 1.9786 | 9.6963 | 1.3933 | 8.9954 | 12.0037 | 0.8643 | 0.855 | 0.8593 |

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1
