flan-t5-rouge-squad-qg-testd

This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3172
  • Rouge1: 0.3825
  • Rouge2: 0.1273
  • Rougel: 0.3525
  • Rougelsum: 0.3638
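
These are the rouge1/rouge2/rougeL/rougeLsum scores reported at the end of training. As a rough illustration only (the actual evaluation script is not included in this card), metrics in this format can be computed with the Hugging Face evaluate library:

```python
# Minimal sketch of computing ROUGE in the same format as the numbers above.
# The prediction/reference strings are placeholders, not the actual eval data.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["what year was the eiffel tower completed ?"],
    references=["in what year was the eiffel tower completed ?"],
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```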

Model description

More information needed

Intended uses & limitations

More information needed
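
The repository name suggests question generation on SQuAD-style data, but this is not confirmed in the card. Assuming that use case, the sketch below shows how the checkpoint can be loaded with transformers; the "generate question:" prompt prefix is an assumption and may not match the format used during fine-tuning.

```python
# Hedged usage sketch: question generation from a context passage.
# The prompt prefix is assumed, not documented for this checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-squad-qg-testd"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = "The Eiffel Tower was completed in 1889 and stands in Paris."
inputs = tokenizer("generate question: " + context, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```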

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 320
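
For orientation only, the hyperparameters above correspond roughly to the following Seq2SeqTrainingArguments (argument names from transformers 4.47). The original training script is not included in this card; the output_dir and the per-epoch evaluation setting are assumptions.

```python
# Sketch of training arguments matching the list above; not the original script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-testd",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=4,   # 80 x 4 = 320 total (single device assumed)
    num_train_epochs=320,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="epoch",           # inferred from the per-epoch results below
    predict_with_generate=True,      # needed for ROUGE on generated text
)
```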

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum
62.247 1.0 3 29.8148 0.0750 0.0208 0.0666 0.0670
51.173 2.0 6 23.8876 0.0668 0.0192 0.0606 0.0610
42.1769 3.0 9 18.4541 0.0526 0.0175 0.0497 0.0495
33.256 4.0 12 12.5653 0.0824 0.0407 0.0826 0.0828
25.1048 5.0 15 8.1533 0.0710 0.0374 0.0703 0.0704
19.8539 6.0 18 7.4179 0.0661 0.0342 0.0659 0.0659
16.7613 7.0 21 6.9241 0.0793 0.0345 0.0760 0.0773
14.5402 8.0 24 5.7677 0.0895 0.0402 0.0822 0.0849
12.564 9.0 27 4.8883 0.0996 0.0445 0.0851 0.0915
11.4386 10.0 30 4.7589 0.1057 0.0450 0.0938 0.0980
10.4913 11.0 33 4.5215 0.0988 0.0410 0.0899 0.0937
9.8507 12.0 36 4.3154 0.1055 0.0444 0.0937 0.0979
9.3569 13.0 39 4.1541 0.1331 0.0567 0.1163 0.1224
8.929 14.0 42 4.0030 0.1275 0.0564 0.1116 0.1173
8.5651 15.0 45 3.8319 0.1559 0.0641 0.1310 0.1396
8.1504 16.0 48 3.6324 0.1498 0.0522 0.1246 0.1324
7.8348 17.0 51 3.4408 0.1193 0.0437 0.1000 0.1064
7.4954 18.0 54 3.2865 0.1104 0.0453 0.0949 0.1011
7.1529 19.0 57 3.1646 0.1040 0.0405 0.0888 0.0949
6.8335 20.0 60 3.0580 0.1235 0.0442 0.0995 0.1060
6.5663 21.0 63 2.9471 0.1265 0.0404 0.1021 0.1096
6.365 22.0 66 2.8180 0.1125 0.0382 0.0929 0.1000
6.1181 23.0 69 2.6687 0.1233 0.0402 0.1035 0.1098
5.8384 24.0 72 2.5123 0.1178 0.0380 0.0961 0.1006
5.5936 25.0 75 2.3656 0.1202 0.0367 0.1024 0.1070
5.3808 26.0 78 2.2328 0.1247 0.0435 0.1071 0.1118
5.1353 27.0 81 2.1150 0.1488 0.0504 0.1311 0.1352
4.9245 28.0 84 2.0090 0.1846 0.0698 0.1601 0.1734
4.7075 29.0 87 1.9051 0.2689 0.0956 0.2361 0.2553
4.5006 30.0 90 1.7932 0.2864 0.0999 0.2538 0.2738
4.3009 31.0 93 1.6756 0.2951 0.1025 0.2651 0.2847
4.0895 32.0 96 1.5604 0.3277 0.1117 0.2942 0.3130
3.8926 33.0 99 1.4551 0.3300 0.1112 0.3031 0.3185
3.7328 34.0 102 1.3608 0.3483 0.1159 0.3172 0.3347
3.51 35.0 105 1.2774 0.3560 0.1184 0.3248 0.3422
3.3655 36.0 108 1.2001 0.3635 0.1213 0.3275 0.3504
3.2056 37.0 111 1.1217 0.3592 0.1193 0.3256 0.3456
3.0361 38.0 114 1.0424 0.3560 0.1173 0.3217 0.3426
2.8543 39.0 117 0.9668 0.3572 0.1171 0.3218 0.3437
2.7049 40.0 120 0.8994 0.3594 0.1181 0.3246 0.3469
2.6019 41.0 123 0.8394 0.3594 0.1181 0.3246 0.3469
2.4523 42.0 126 0.7840 0.3628 0.1179 0.3269 0.3508
2.3234 43.0 129 0.7356 0.3651 0.1186 0.3295 0.3530
2.2271 44.0 132 0.6886 0.3675 0.1170 0.3291 0.3533
2.0893 45.0 135 0.6449 0.3657 0.1166 0.3274 0.3511
2.0187 46.0 138 0.6045 0.3657 0.1166 0.3274 0.3511
1.914 47.0 141 0.5696 0.3657 0.1166 0.3274 0.3511
1.761 48.0 144 0.5373 0.3657 0.1168 0.3274 0.3511
1.7569 49.0 147 0.5078 0.3657 0.1160 0.3269 0.3517
1.617 50.0 150 0.4828 0.3658 0.1162 0.3274 0.3515
1.5393 51.0 153 0.4599 0.3655 0.1191 0.3297 0.3517
1.4776 52.0 156 0.4384 0.3682 0.1216 0.3335 0.3550
1.4251 53.0 159 0.4202 0.3711 0.1226 0.3357 0.3576
1.3735 54.0 162 0.4040 0.3750 0.1258 0.3385 0.3602
1.3235 55.0 165 0.3889 0.3733 0.1257 0.3375 0.3580
1.265 56.0 168 0.3757 0.3733 0.1257 0.3375 0.3580
1.1689 57.0 171 0.3634 0.3733 0.1257 0.3375 0.3580
1.1205 58.0 174 0.3526 0.3737 0.1259 0.3378 0.3583
1.0933 59.0 177 0.3433 0.3743 0.1282 0.3370 0.3598
1.1121 60.0 180 0.3350 0.3782 0.1292 0.3402 0.3635
1.0716 61.0 183 0.3277 0.3601 0.1188 0.3265 0.3455
0.9798 62.0 186 0.3213 0.3550 0.1163 0.3242 0.3411
0.952 63.0 189 0.3158 0.3591 0.1226 0.3283 0.3452
0.9912 64.0 192 0.3106 0.3609 0.1222 0.3345 0.3481
0.9127 65.0 195 0.3055 0.3564 0.1204 0.3294 0.3425
0.8917 66.0 198 0.3012 0.3560 0.1201 0.3296 0.3429
0.8884 67.0 201 0.2978 0.3529 0.1189 0.3258 0.3394
0.847 68.0 204 0.2949 0.3547 0.1208 0.3278 0.3414
0.8606 69.0 207 0.2922 0.3610 0.1204 0.3321 0.3446
0.8424 70.0 210 0.2896 0.3915 0.1375 0.3564 0.3738
0.7787 71.0 213 0.2872 0.3998 0.1419 0.3630 0.3848
0.7432 72.0 216 0.2852 0.3948 0.1382 0.3590 0.3810
0.7564 73.0 219 0.2833 0.3968 0.1380 0.3613 0.3833
0.754 74.0 222 0.2816 0.3993 0.1408 0.3644 0.3851
0.7192 75.0 225 0.2802 0.4072 0.1458 0.3654 0.3923
0.7213 76.0 228 0.2790 0.4043 0.1427 0.3659 0.3885
0.6866 77.0 231 0.2779 0.4038 0.1431 0.3668 0.3874
0.6775 78.0 234 0.2769 0.4038 0.1431 0.3668 0.3874
0.6183 79.0 237 0.2759 0.4043 0.1434 0.3690 0.3881
0.6822 80.0 240 0.2750 0.3983 0.1411 0.3630 0.3821
0.6479 81.0 243 0.2743 0.4041 0.1446 0.3661 0.3870
0.6156 82.0 246 0.2737 0.4041 0.1423 0.3647 0.3870
0.6385 83.0 249 0.2732 0.4051 0.1422 0.3653 0.3877
0.5933 84.0 252 0.2727 0.4041 0.1415 0.3647 0.3882
0.5804 85.0 255 0.2724 0.3986 0.1402 0.3604 0.3830
0.5972 86.0 258 0.2724 0.3986 0.1402 0.3604 0.3830
0.5974 87.0 261 0.2726 0.3992 0.1372 0.3593 0.3821
0.5638 88.0 264 0.2728 0.4019 0.1419 0.3624 0.3830
0.5944 89.0 267 0.2728 0.3988 0.1401 0.3600 0.3811
0.5376 90.0 270 0.2727 0.3992 0.1442 0.3636 0.3807
0.5403 91.0 273 0.2725 0.4021 0.1432 0.3677 0.3821
0.5554 92.0 276 0.2727 0.3902 0.1405 0.3566 0.3718
0.5088 93.0 279 0.2732 0.3857 0.1374 0.3545 0.3686
0.5104 94.0 282 0.2736 0.3871 0.1429 0.3553 0.3687
0.5169 95.0 285 0.2738 0.3896 0.1446 0.3573 0.3716
0.5073 96.0 288 0.2744 0.3940 0.1407 0.3585 0.3770
0.493 97.0 291 0.2750 0.3936 0.1401 0.3583 0.3765
0.5112 98.0 294 0.2756 0.3934 0.1400 0.3582 0.3763
0.4956 99.0 297 0.2760 0.3924 0.1400 0.3577 0.3755
0.451 100.0 300 0.2762 0.3946 0.1408 0.3605 0.3785
0.4518 101.0 303 0.2766 0.3931 0.1402 0.3584 0.3762
0.4978 102.0 306 0.2772 0.3930 0.1411 0.3589 0.3764
0.4707 103.0 309 0.2782 0.3899 0.1411 0.3589 0.3727
0.462 104.0 312 0.2793 0.3899 0.1411 0.3589 0.3727
0.4706 105.0 315 0.2799 0.3955 0.1367 0.3616 0.3777
0.4762 106.0 318 0.2807 0.3918 0.1343 0.3588 0.3741
0.4111 107.0 321 0.2811 0.3918 0.1343 0.3588 0.3741
0.417 108.0 324 0.2815 0.4029 0.1406 0.3685 0.3846
0.4255 109.0 327 0.2819 0.3983 0.1350 0.3617 0.3817
0.4114 110.0 330 0.2829 0.3977 0.1383 0.3594 0.3800
0.4327 111.0 333 0.2840 0.3971 0.1376 0.3588 0.3794
0.4261 112.0 336 0.2847 0.3892 0.1322 0.3505 0.3717
0.4185 113.0 339 0.2852 0.3792 0.1228 0.3406 0.3633
0.4145 114.0 342 0.2859 0.3794 0.1228 0.3411 0.3639
0.4198 115.0 345 0.2867 0.3757 0.1184 0.3373 0.3602
0.4012 116.0 348 0.2874 0.3806 0.1203 0.3407 0.3638
0.4371 117.0 351 0.2878 0.3780 0.1209 0.3376 0.3605
0.4001 118.0 354 0.2880 0.3779 0.1207 0.3375 0.3604
0.3914 119.0 357 0.2885 0.3835 0.1277 0.3435 0.3655
0.3985 120.0 360 0.2891 0.3739 0.1221 0.3381 0.3559
0.3902 121.0 363 0.2902 0.3825 0.1226 0.3431 0.3649
0.4109 122.0 366 0.2914 0.3825 0.1226 0.3431 0.3649
0.3785 123.0 369 0.2921 0.3790 0.1223 0.3393 0.3611
0.3985 124.0 372 0.2926 0.3790 0.1223 0.3393 0.3611
0.3709 125.0 375 0.2928 0.3768 0.1215 0.3376 0.3593
0.3844 126.0 378 0.2933 0.3769 0.1196 0.3366 0.3595
0.3716 127.0 381 0.2938 0.3769 0.1196 0.3366 0.3595
0.3907 128.0 384 0.2945 0.3789 0.1205 0.3380 0.3609
0.3565 129.0 387 0.2951 0.3811 0.1219 0.3388 0.3629
0.363 130.0 390 0.2959 0.3794 0.1211 0.3374 0.3616
0.3389 131.0 393 0.2967 0.3798 0.1234 0.3377 0.3611
0.3862 132.0 396 0.2975 0.3804 0.1234 0.3417 0.3609
0.3791 133.0 399 0.2982 0.3804 0.1234 0.3417 0.3609
0.3707 134.0 402 0.2986 0.3770 0.1210 0.3388 0.3573
0.3381 135.0 405 0.2989 0.3742 0.1192 0.3369 0.3556
0.3637 136.0 408 0.2994 0.3815 0.1270 0.3457 0.3616
0.3488 137.0 411 0.2999 0.3819 0.1268 0.3461 0.3620
0.3447 138.0 414 0.3003 0.3819 0.1268 0.3461 0.3620
0.3503 139.0 417 0.3005 0.3821 0.1270 0.3463 0.3622
0.3337 140.0 420 0.3008 0.3775 0.1244 0.3429 0.3577
0.3543 141.0 423 0.3013 0.3792 0.1254 0.3441 0.3590
0.3206 142.0 426 0.3017 0.3796 0.1254 0.3445 0.3593
0.3527 143.0 429 0.3021 0.3781 0.1257 0.3450 0.3590
0.3393 144.0 432 0.3026 0.3825 0.1313 0.3507 0.3635
0.3653 145.0 435 0.3032 0.3821 0.1304 0.3511 0.3634
0.3169 146.0 438 0.3039 0.3822 0.1296 0.3517 0.3637
0.3539 147.0 441 0.3042 0.3806 0.1295 0.3493 0.3618
0.3131 148.0 444 0.3047 0.3806 0.1295 0.3493 0.3618
0.3501 149.0 447 0.3053 0.3811 0.1295 0.3497 0.3622
0.3273 150.0 450 0.3058 0.3766 0.1248 0.3438 0.3578
0.3397 151.0 453 0.3061 0.3766 0.1248 0.3438 0.3578
0.3215 152.0 456 0.3062 0.3764 0.1239 0.3440 0.3577
0.3169 153.0 459 0.3064 0.3764 0.1239 0.3440 0.3577
0.3411 154.0 462 0.3067 0.3764 0.1239 0.3440 0.3577
0.3145 155.0 465 0.3069 0.3777 0.1251 0.3453 0.3588
0.3356 156.0 468 0.3072 0.3755 0.1240 0.3430 0.3565
0.3088 157.0 471 0.3073 0.3760 0.1241 0.3434 0.3569
0.3266 158.0 474 0.3077 0.3760 0.1241 0.3434 0.3569
0.3275 159.0 477 0.3082 0.3776 0.1237 0.3430 0.3573
0.3328 160.0 480 0.3086 0.3776 0.1237 0.3430 0.3573
0.3192 161.0 483 0.3090 0.3790 0.1246 0.3435 0.3576
0.3205 162.0 486 0.3094 0.3795 0.1246 0.3439 0.3581
0.3099 163.0 489 0.3098 0.3800 0.1248 0.3443 0.3586
0.3239 164.0 492 0.3101 0.3800 0.1248 0.3443 0.3586
0.2915 165.0 495 0.3104 0.3777 0.1260 0.3447 0.3584
0.3374 166.0 498 0.3107 0.3777 0.1260 0.3447 0.3584
0.3167 167.0 501 0.3113 0.3781 0.1242 0.3473 0.3596
0.3076 168.0 504 0.3116 0.3781 0.1242 0.3473 0.3596
0.3128 169.0 507 0.3118 0.3762 0.1240 0.3463 0.3579
0.3082 170.0 510 0.3120 0.3762 0.1228 0.3458 0.3583
0.3139 171.0 513 0.3120 0.3796 0.1231 0.3479 0.3600
0.3136 172.0 516 0.3119 0.3796 0.1231 0.3479 0.3600
0.3149 173.0 519 0.3119 0.3754 0.1219 0.3432 0.3562
0.3149 174.0 522 0.3121 0.3736 0.1203 0.3417 0.3550
0.3192 175.0 525 0.3121 0.3773 0.1215 0.3439 0.3578
0.3136 176.0 528 0.3123 0.3773 0.1215 0.3439 0.3578
0.3068 177.0 531 0.3125 0.3773 0.1215 0.3439 0.3578
0.3083 178.0 534 0.3129 0.3755 0.1206 0.3426 0.3567
0.3096 179.0 537 0.3133 0.3755 0.1206 0.3426 0.3567
0.3081 180.0 540 0.3137 0.3767 0.1253 0.3456 0.3571
0.3131 181.0 543 0.3140 0.3767 0.1253 0.3456 0.3571
0.2881 182.0 546 0.3142 0.3773 0.1255 0.3460 0.3576
0.2857 183.0 549 0.3143 0.3773 0.1255 0.3460 0.3576
0.2978 184.0 552 0.3145 0.3773 0.1255 0.3460 0.3576
0.2917 185.0 555 0.3146 0.3773 0.1255 0.3460 0.3576
0.3251 186.0 558 0.3147 0.3773 0.1255 0.3460 0.3576
0.3096 187.0 561 0.3147 0.3773 0.1255 0.3460 0.3576
0.2918 188.0 564 0.3149 0.3773 0.1255 0.3460 0.3576
0.2953 189.0 567 0.3152 0.3773 0.1255 0.3460 0.3576
0.2976 190.0 570 0.3154 0.3782 0.1217 0.3467 0.3593
0.3221 191.0 573 0.3154 0.3782 0.1217 0.3467 0.3593
0.285 192.0 576 0.3155 0.3760 0.1208 0.3430 0.3571
0.3 193.0 579 0.3156 0.3760 0.1208 0.3430 0.3571
0.2962 194.0 582 0.3159 0.3760 0.1208 0.3430 0.3571
0.3068 195.0 585 0.3161 0.3760 0.1208 0.3430 0.3571
0.3138 196.0 588 0.3162 0.3760 0.1208 0.3430 0.3571
0.305 197.0 591 0.3163 0.3760 0.1208 0.3430 0.3571
0.3013 198.0 594 0.3164 0.3760 0.1208 0.3430 0.3571
0.2877 199.0 597 0.3166 0.3760 0.1208 0.3430 0.3571
0.2951 200.0 600 0.3167 0.3760 0.1208 0.3430 0.3571
0.316 201.0 603 0.3169 0.3762 0.1210 0.3432 0.3573
0.2974 202.0 606 0.3170 0.3783 0.1218 0.3469 0.3595
0.2773 203.0 609 0.3171 0.3783 0.1218 0.3469 0.3595
0.2948 204.0 612 0.3171 0.3783 0.1218 0.3469 0.3595
0.3053 205.0 615 0.3171 0.3783 0.1218 0.3469 0.3595
0.2965 206.0 618 0.3172 0.3783 0.1218 0.3469 0.3595
0.2936 207.0 621 0.3172 0.3825 0.1273 0.3525 0.3638
0.2952 208.0 624 0.3172 0.3825 0.1273 0.3525 0.3638
0.293 209.0 627 0.3172 0.3825 0.1273 0.3525 0.3638
0.2994 210.0 630 0.3172 0.3825 0.1273 0.3525 0.3638
0.3017 211.0 633 0.3172 0.3825 0.1273 0.3525 0.3638
0.3058 212.0 636 0.3172 0.3825 0.1273 0.3525 0.3638
0.2976 213.0 639 0.3172 0.3825 0.1273 0.3525 0.3638
0.5935 213.4 640 0.3172 0.3825 0.1273 0.3525 0.3638

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Safetensors

  • Model size: 77M params
  • Tensor type: F32
