flan-t5-rouge-squad-qg-testc

This model is a fine-tuned version of google/flan-t5-small on an unknown dataset (a minimal usage sketch follows the metrics below). It achieves the following results on the evaluation set:

  • Loss: 0.3164
  • ROUGE-1: 0.3601
  • ROUGE-2: 0.1205
  • ROUGE-L: 0.3353
  • ROUGE-Lsum: 0.3469
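
The checkpoint can be loaded with the `transformers` library as in the sketch below. The prompt template is an assumption: the exact input format used for question generation during fine-tuning is not documented in this card.

```python
# Minimal inference sketch for this checkpoint. The "generate question:" prompt
# template is an assumption; adjust it to whatever format was used in training.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/flan-t5-rouge-squad-qg-testc"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = "The Eiffel Tower was completed in 1889 and is located in Paris."
inputs = tokenizer(f"generate question: {context}", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```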

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the training-arguments sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 160
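
As a reproducibility aid, the following is a sketch of `Seq2SeqTrainingArguments` mirroring the list above. The output directory, evaluation strategy, and `predict_with_generate` setting are assumptions; the card does not state them.

```python
# Sketch of training arguments matching the hyperparameters listed above.
# Dataset loading, preprocessing, and the Seq2SeqTrainer setup are omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-testc",  # assumed output path
    learning_rate=1e-4,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=4,  # effective train batch size: 80 * 4 = 320
    num_train_epochs=160,
    lr_scheduler_type="linear",
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-8
    seed=42,
    eval_strategy="epoch",          # assumed: the results table reports one eval per epoch
    predict_with_generate=True,     # assumed: required to compute ROUGE at eval time
)
```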

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|--------------:|------:|-----:|----------------:|--------:|--------:|--------:|-----------:|
| 62.2537 | 1.0 | 3 | 29.8253 | 0.0752 | 0.0204 | 0.0667 | 0.0672 |
| 51.2172 | 2.0 | 6 | 23.9235 | 0.0670 | 0.0190 | 0.0607 | 0.0609 |
| 42.2884 | 3.0 | 9 | 18.5434 | 0.0527 | 0.0175 | 0.0494 | 0.0496 |
| 33.4669 | 4.0 | 12 | 12.7318 | 0.0825 | 0.0407 | 0.0824 | 0.0827 |
| 25.4026 | 5.0 | 15 | 8.2492 | 0.0706 | 0.0377 | 0.0701 | 0.0704 |
| 20.0834 | 6.0 | 18 | 7.4425 | 0.0662 | 0.0349 | 0.0664 | 0.0666 |
| 16.9821 | 7.0 | 21 | 6.9817 | 0.0789 | 0.0347 | 0.0759 | 0.0776 |
| 14.7652 | 8.0 | 24 | 5.9311 | 0.0961 | 0.0435 | 0.0905 | 0.0937 |
| 12.7754 | 9.0 | 27 | 4.9313 | 0.1005 | 0.0444 | 0.0841 | 0.0916 |
| 11.6278 | 10.0 | 30 | 4.7801 | 0.1311 | 0.0577 | 0.1158 | 0.1214 |
| 10.6391 | 11.0 | 33 | 4.5698 | 0.0978 | 0.0372 | 0.0872 | 0.0905 |
| 9.9925 | 12.0 | 36 | 4.3618 | 0.1011 | 0.0440 | 0.0892 | 0.0934 |
| 9.4969 | 13.0 | 39 | 4.2002 | 0.1331 | 0.0582 | 0.1167 | 0.1221 |
| 9.0639 | 14.0 | 42 | 4.0571 | 0.1327 | 0.0564 | 0.1165 | 0.1217 |
| 8.7064 | 15.0 | 45 | 3.9041 | 0.1484 | 0.0628 | 0.1267 | 0.1337 |
| 8.3122 | 16.0 | 48 | 3.7252 | 0.1528 | 0.0543 | 0.1264 | 0.1345 |
| 8.0191 | 17.0 | 51 | 3.5352 | 0.1356 | 0.0502 | 0.1129 | 0.1204 |
| 7.7028 | 18.0 | 54 | 3.3741 | 0.1175 | 0.0426 | 0.0978 | 0.1045 |
| 7.3704 | 19.0 | 57 | 3.2478 | 0.1111 | 0.0460 | 0.0947 | 0.1004 |
| 7.059 | 20.0 | 60 | 3.1444 | 0.1052 | 0.0359 | 0.0842 | 0.0900 |
| 6.798 | 21.0 | 63 | 3.0483 | 0.1240 | 0.0444 | 0.0990 | 0.1057 |
| 6.6172 | 22.0 | 66 | 2.9450 | 0.1281 | 0.0416 | 0.1018 | 0.1100 |
| 6.397 | 23.0 | 69 | 2.8270 | 0.1294 | 0.0426 | 0.1058 | 0.1154 |
| 6.1434 | 24.0 | 72 | 2.6957 | 0.1192 | 0.0405 | 0.1011 | 0.1072 |
| 5.9183 | 25.0 | 75 | 2.5599 | 0.1207 | 0.0367 | 0.0974 | 0.1027 |
| 5.7236 | 26.0 | 78 | 2.4320 | 0.1124 | 0.0342 | 0.0950 | 0.1001 |
| 5.5052 | 27.0 | 81 | 2.3171 | 0.1121 | 0.0367 | 0.0961 | 0.1004 |
| 5.3234 | 28.0 | 84 | 2.2137 | 0.1322 | 0.0453 | 0.1132 | 0.1197 |
| 5.1292 | 29.0 | 87 | 2.1201 | 0.1464 | 0.0526 | 0.1248 | 0.1322 |
| 4.9497 | 30.0 | 90 | 2.0297 | 0.2406 | 0.0923 | 0.2102 | 0.2278 |
| 4.7775 | 31.0 | 93 | 1.9376 | 0.2581 | 0.0958 | 0.2229 | 0.2445 |
| 4.5872 | 32.0 | 96 | 1.8415 | 0.2783 | 0.0982 | 0.2442 | 0.2654 |
| 4.4228 | 33.0 | 99 | 1.7448 | 0.2993 | 0.1034 | 0.2699 | 0.2878 |
| 4.2818 | 34.0 | 102 | 1.6519 | 0.3106 | 0.1082 | 0.2810 | 0.2983 |
| 4.0818 | 35.0 | 105 | 1.5682 | 0.3280 | 0.1101 | 0.2963 | 0.3144 |
| 3.9575 | 36.0 | 108 | 1.4935 | 0.3307 | 0.1099 | 0.2992 | 0.3180 |
| 3.8176 | 37.0 | 111 | 1.4235 | 0.3464 | 0.1150 | 0.3140 | 0.3318 |
| 3.666 | 38.0 | 114 | 1.3551 | 0.3496 | 0.1166 | 0.3173 | 0.3349 |
| 3.5058 | 39.0 | 117 | 1.2865 | 0.3496 | 0.1166 | 0.3173 | 0.3349 |
| 3.3658 | 40.0 | 120 | 1.2200 | 0.3475 | 0.1166 | 0.3155 | 0.3334 |
| 3.2795 | 41.0 | 123 | 1.1562 | 0.3522 | 0.1179 | 0.3226 | 0.3386 |
| 3.1434 | 42.0 | 126 | 1.0954 | 0.3522 | 0.1179 | 0.3226 | 0.3386 |
| 3.0247 | 43.0 | 129 | 1.0422 | 0.3522 | 0.1179 | 0.3226 | 0.3386 |
| 2.9343 | 44.0 | 132 | 0.9925 | 0.3529 | 0.1185 | 0.3238 | 0.3393 |
| 2.8065 | 45.0 | 135 | 0.9465 | 0.3529 | 0.1185 | 0.3238 | 0.3393 |
| 2.7406 | 46.0 | 138 | 0.9023 | 0.3529 | 0.1185 | 0.3238 | 0.3393 |
| 2.6367 | 47.0 | 141 | 0.8608 | 0.3551 | 0.1166 | 0.3249 | 0.3427 |
| 2.4855 | 48.0 | 144 | 0.8197 | 0.3592 | 0.1172 | 0.3284 | 0.3457 |
| 2.4782 | 49.0 | 147 | 0.7803 | 0.3592 | 0.1172 | 0.3284 | 0.3457 |
| 2.3351 | 50.0 | 150 | 0.7463 | 0.3586 | 0.1149 | 0.3253 | 0.3446 |
| 2.2519 | 51.0 | 153 | 0.7154 | 0.3655 | 0.1169 | 0.3281 | 0.3517 |
| 2.1864 | 52.0 | 156 | 0.6861 | 0.3676 | 0.1171 | 0.3293 | 0.3530 |
| 2.128 | 53.0 | 159 | 0.6597 | 0.3676 | 0.1171 | 0.3293 | 0.3530 |
| 2.0668 | 54.0 | 162 | 0.6343 | 0.3676 | 0.1171 | 0.3293 | 0.3530 |
| 2.013 | 55.0 | 165 | 0.6098 | 0.3658 | 0.1167 | 0.3277 | 0.3506 |
| 1.9364 | 56.0 | 168 | 0.5872 | 0.3658 | 0.1167 | 0.3277 | 0.3506 |
| 1.8327 | 57.0 | 171 | 0.5655 | 0.3658 | 0.1167 | 0.3277 | 0.3506 |
| 1.7749 | 58.0 | 174 | 0.5456 | 0.3659 | 0.1169 | 0.3282 | 0.3506 |
| 1.7399 | 59.0 | 177 | 0.5276 | 0.3659 | 0.1169 | 0.3282 | 0.3506 |
| 1.7449 | 60.0 | 180 | 0.5110 | 0.3659 | 0.1169 | 0.3282 | 0.3506 |
| 1.6973 | 61.0 | 183 | 0.4959 | 0.3659 | 0.1169 | 0.3282 | 0.3506 |
| 1.5943 | 62.0 | 186 | 0.4822 | 0.3658 | 0.1185 | 0.3280 | 0.3506 |
| 1.5571 | 63.0 | 189 | 0.4703 | 0.3666 | 0.1194 | 0.3308 | 0.3519 |
| 1.5806 | 64.0 | 192 | 0.4589 | 0.3666 | 0.1194 | 0.3308 | 0.3519 |
| 1.5002 | 65.0 | 195 | 0.4471 | 0.3666 | 0.1194 | 0.3308 | 0.3519 |
| 1.4634 | 66.0 | 198 | 0.4356 | 0.3666 | 0.1194 | 0.3308 | 0.3519 |
| 1.4553 | 67.0 | 201 | 0.4250 | 0.3697 | 0.1216 | 0.3344 | 0.3559 |
| 1.4035 | 68.0 | 204 | 0.4164 | 0.3701 | 0.1222 | 0.3350 | 0.3561 |
| 1.4084 | 69.0 | 207 | 0.4090 | 0.3739 | 0.1255 | 0.3378 | 0.3594 |
| 1.3806 | 70.0 | 210 | 0.4023 | 0.3739 | 0.1255 | 0.3378 | 0.3594 |
| 1.3048 | 71.0 | 213 | 0.3957 | 0.3728 | 0.1254 | 0.3373 | 0.3576 |
| 1.2709 | 72.0 | 216 | 0.3891 | 0.3728 | 0.1254 | 0.3373 | 0.3576 |
| 1.2735 | 73.0 | 219 | 0.3828 | 0.3728 | 0.1254 | 0.3373 | 0.3576 |
| 1.2733 | 74.0 | 222 | 0.3768 | 0.3728 | 0.1254 | 0.3373 | 0.3576 |
| 1.2215 | 75.0 | 225 | 0.3715 | 0.3728 | 0.1254 | 0.3373 | 0.3576 |
| 1.2225 | 76.0 | 228 | 0.3669 | 0.3732 | 0.1255 | 0.3374 | 0.3580 |
| 1.1829 | 77.0 | 231 | 0.3628 | 0.3732 | 0.1255 | 0.3374 | 0.3580 |
| 1.162 | 78.0 | 234 | 0.3591 | 0.3722 | 0.1244 | 0.3362 | 0.3570 |
| 1.097 | 79.0 | 237 | 0.3556 | 0.3715 | 0.1236 | 0.3355 | 0.3564 |
| 1.1702 | 80.0 | 240 | 0.3519 | 0.3715 | 0.1236 | 0.3355 | 0.3564 |
| 1.1309 | 81.0 | 243 | 0.3483 | 0.3764 | 0.1259 | 0.3400 | 0.3608 |
| 1.0986 | 82.0 | 246 | 0.3451 | 0.3563 | 0.1178 | 0.3261 | 0.3428 |
| 1.1109 | 83.0 | 249 | 0.3422 | 0.3563 | 0.1178 | 0.3261 | 0.3428 |
| 1.0752 | 84.0 | 252 | 0.3397 | 0.3553 | 0.1167 | 0.3257 | 0.3417 |
| 1.0475 | 85.0 | 255 | 0.3374 | 0.3553 | 0.1167 | 0.3257 | 0.3417 |
| 1.0736 | 86.0 | 258 | 0.3353 | 0.3553 | 0.1167 | 0.3257 | 0.3417 |
| 1.0723 | 87.0 | 261 | 0.3333 | 0.3552 | 0.1176 | 0.3249 | 0.3409 |
| 1.0326 | 88.0 | 264 | 0.3314 | 0.3536 | 0.1165 | 0.3236 | 0.3394 |
| 1.0742 | 89.0 | 267 | 0.3297 | 0.3536 | 0.1165 | 0.3236 | 0.3394 |
| 1.0081 | 90.0 | 270 | 0.3281 | 0.3583 | 0.1172 | 0.3282 | 0.3447 |
| 1.0158 | 91.0 | 273 | 0.3266 | 0.3583 | 0.1172 | 0.3282 | 0.3447 |
| 1.032 | 92.0 | 276 | 0.3252 | 0.3632 | 0.1213 | 0.3330 | 0.3497 |
| 0.9778 | 93.0 | 279 | 0.3239 | 0.3632 | 0.1213 | 0.3330 | 0.3497 |
| 0.9834 | 94.0 | 282 | 0.3228 | 0.3624 | 0.1221 | 0.3323 | 0.3492 |
| 0.9913 | 95.0 | 285 | 0.3218 | 0.3624 | 0.1221 | 0.3323 | 0.3492 |
| 0.9963 | 96.0 | 288 | 0.3210 | 0.3624 | 0.1221 | 0.3323 | 0.3492 |
| 0.9759 | 97.0 | 291 | 0.3202 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 1.0011 | 98.0 | 294 | 0.3195 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 0.9895 | 99.0 | 297 | 0.3189 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 0.926 | 100.0 | 300 | 0.3183 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 0.9347 | 101.0 | 303 | 0.3178 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 1.0039 | 102.0 | 306 | 0.3173 | 0.3605 | 0.1208 | 0.3344 | 0.3465 |
| 0.9693 | 103.0 | 309 | 0.3170 | 0.3603 | 0.1205 | 0.3353 | 0.3469 |
| 0.9754 | 104.0 | 312 | 0.3167 | 0.3601 | 0.1205 | 0.3353 | 0.3469 |
| 0.9872 | 105.0 | 315 | 0.3165 | 0.3601 | 0.1205 | 0.3353 | 0.3469 |
| 1.0003 | 106.0 | 318 | 0.3164 | 0.3601 | 0.1205 | 0.3353 | 0.3469 |
| 1.9782 | 106.8 | 320 | 0.3164 | 0.3601 | 0.1205 | 0.3353 | 0.3469 |
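
The ROUGE columns above correspond to the standard rouge1/rouge2/rougeL/rougeLsum scores. A minimal sketch of how such scores are typically computed with the `evaluate` library follows; this is an assumption, since `evaluate` is not listed under framework versions, and the strings below are hypothetical placeholders.

```python
# Sketch of ROUGE computation with the `evaluate` library; the prediction and
# reference strings are hypothetical placeholders, not actual model outputs.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["in what year was the eiffel tower completed"]
references = ["in which year was the eiffel tower completed"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with keys: rouge1, rouge2, rougeL, rougeLsum
```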

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0