bengali_qa_model_AGGRO_bert_base_uncased

This model is a fine-tuned version of bert-base-uncased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1386
  • Exact Match: 95.2857
  • F1 Score: 96.3846
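
The card does not state how Exact Match and F1 were computed. A common convention for extractive question answering is the SQuAD-style definition, so the sketch below assumes it: Exact Match checks whether the normalized prediction equals the normalized reference, and F1 is the harmonic mean of token-level precision and recall. The helper names `normalize`, `exact_match`, and `token_f1` are hypothetical, and the normalization here (lowercasing and whitespace splitting) is deliberately simplified; the evaluation behind the numbers above may differ.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a simplified, assumed normalization)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A prediction with one extra token: no exact match, but partial F1 credit.
    print(exact_match("the red car", "red car"))
    print(token_f1("the red car", "red car"))
```

Under this definition a prediction can earn partial F1 credit without matching exactly, which would be consistent with F1 being at or above Exact Match throughout the training log. Averaging the per-example scores and multiplying by 100 gives percentages on the scale reported above.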

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 3407
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 150
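
The derived values in this list can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions, not the author's training script: it assumes a single device (so that 4 × 16 gives the reported total batch size of 64), assumes the warmup length is the ceiling of the warmup ratio times the number of training steps, and assumes a linear warmup followed by a half-cosine decay to zero. The helper name `lr_at` is hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-05
train_batch_size = 4
gradient_accumulation_steps = 16
warmup_ratio = 0.1
training_steps = 150

# Assumption: one device, so the effective batch is batch size x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: warmup length is the ceiling of ratio x total steps.
warmup_steps = math.ceil(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then half-cosine decay."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 15, 74, 150):
        print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 1e-05 after 15 steps and decays to zero at step 150.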

Training results

Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score
6.2014 | 0.0053 | 1 | 6.1979 | 0.0 | 7.0506
6.1881 | 0.0107 | 2 | 6.1825 | 0.0 | 7.3879
6.2139 | 0.0160 | 3 | 6.1522 | 0.0 | 8.2521
6.1951 | 0.0214 | 4 | 6.1075 | 0.0 | 10.5119
6.1083 | 0.0267 | 5 | 6.0489 | 0.4511 | 15.3179
6.0835 | 0.0321 | 6 | 5.9767 | 2.1053 | 24.1412
6.0129 | 0.0374 | 7 | 5.8920 | 3.6842 | 32.7230
5.9993 | 0.0428 | 8 | 5.7951 | 6.5414 | 40.7170
5.8684 | 0.0481 | 9 | 5.6866 | 9.9248 | 46.7951
5.7231 | 0.0535 | 10 | 5.5651 | 13.8346 | 51.5365
5.6534 | 0.0588 | 11 | 5.4313 | 18.4211 | 55.1575
5.5657 | 0.0641 | 12 | 5.2856 | 23.2331 | 58.0711
5.5148 | 0.0695 | 13 | 5.1299 | 26.6165 | 60.2489
5.3065 | 0.0748 | 14 | 4.9627 | 29.7744 | 62.0035
5.1358 | 0.0802 | 15 | 4.7850 | 33.7594 | 64.1128
4.9002 | 0.0855 | 16 | 4.5951 | 36.9173 | 66.1340
4.8232 | 0.0909 | 17 | 4.4087 | 39.7744 | 67.8590
4.604 | 0.0962 | 18 | 4.2247 | 43.1579 | 69.4792
4.5291 | 0.1016 | 19 | 4.0460 | 47.1429 | 71.8867
4.3711 | 0.1069 | 20 | 3.8705 | 50.2256 | 73.7469
4.2404 | 0.1123 | 21 | 3.6984 | 52.6316 | 74.5106
4.1769 | 0.1176 | 22 | 3.5303 | 54.9624 | 75.5802
3.8464 | 0.1230 | 23 | 3.3656 | 56.4662 | 76.4567
3.8178 | 0.1283 | 24 | 3.2085 | 57.9699 | 77.2230
3.6047 | 0.1336 | 25 | 3.0549 | 60.3008 | 78.0660
3.4466 | 0.1390 | 26 | 2.9061 | 61.8797 | 78.4774
3.3154 | 0.1443 | 27 | 2.7639 | 63.8346 | 79.6681
3.3505 | 0.1497 | 28 | 2.6242 | 66.1654 | 80.4722
3.0315 | 0.1550 | 29 | 2.4883 | 67.3684 | 81.0899
2.8796 | 0.1604 | 30 | 2.3571 | 68.9474 | 82.0043
2.9183 | 0.1657 | 31 | 2.2325 | 70.8271 | 82.8722
2.4212 | 0.1711 | 32 | 2.1128 | 71.7293 | 83.2822
2.283 | 0.1764 | 33 | 1.9975 | 72.2556 | 83.7327
2.2454 | 0.1818 | 34 | 1.8851 | 73.0075 | 84.1584
2.1467 | 0.1871 | 35 | 1.7746 | 74.2105 | 84.9934
2.1079 | 0.1924 | 36 | 1.6643 | 76.0150 | 86.0725
1.7214 | 0.1978 | 37 | 1.5538 | 77.6692 | 87.2392
2.0057 | 0.2031 | 38 | 1.4412 | 78.1203 | 87.5816
1.8565 | 0.2085 | 39 | 1.3280 | 79.5489 | 88.3833
1.8383 | 0.2138 | 40 | 1.2168 | 80.4511 | 89.0872
1.6629 | 0.2192 | 41 | 1.1118 | 81.0526 | 89.6248
1.3915 | 0.2245 | 42 | 1.0174 | 81.6541 | 89.5453
1.3512 | 0.2299 | 43 | 0.9330 | 82.3308 | 89.6999
1.1958 | 0.2352 | 44 | 0.8608 | 82.4812 | 89.8430
1.0808 | 0.2406 | 45 | 0.8000 | 82.7820 | 89.7950
1.0992 | 0.2459 | 46 | 0.7469 | 83.0075 | 89.8987
0.8846 | 0.2513 | 47 | 0.6994 | 83.3083 | 90.0405
0.9159 | 0.2566 | 48 | 0.6537 | 84.5113 | 90.4492
0.7867 | 0.2619 | 49 | 0.6108 | 85.1128 | 90.7277
0.849 | 0.2673 | 50 | 0.5724 | 86.2406 | 90.9660
0.7173 | 0.2726 | 51 | 0.5379 | 86.7669 | 91.2355
0.8123 | 0.2780 | 52 | 0.5058 | 87.3684 | 91.5345
0.6065 | 0.2833 | 53 | 0.4770 | 87.3684 | 91.4719
0.5135 | 0.2887 | 54 | 0.4495 | 87.6692 | 91.4269
0.809 | 0.2940 | 55 | 0.4236 | 88.3459 | 91.6852
0.5281 | 0.2994 | 56 | 0.3990 | 88.6466 | 91.8734
0.5029 | 0.3047 | 57 | 0.3780 | 89.2481 | 92.0989
0.5069 | 0.3101 | 58 | 0.3593 | 89.6241 | 92.2735
0.4163 | 0.3154 | 59 | 0.3425 | 90.1504 | 92.6192
0.4271 | 0.3207 | 60 | 0.3275 | 90.8271 | 92.8536
0.385 | 0.3261 | 61 | 0.3087 | 91.3534 | 93.0501
0.365 | 0.3314 | 62 | 0.2891 | 91.5038 | 93.1793
0.3785 | 0.3368 | 63 | 0.2729 | 91.8797 | 93.3364
0.1751 | 0.3421 | 64 | 0.2598 | 92.2556 | 93.5858
0.3697 | 0.3475 | 65 | 0.2499 | 92.7068 | 93.9267
0.4384 | 0.3528 | 66 | 0.2413 | 92.9323 | 94.1523
0.2969 | 0.3582 | 67 | 0.2327 | 93.0075 | 94.2276
0.3493 | 0.3635 | 68 | 0.2251 | 93.3083 | 94.4743
0.1457 | 0.3689 | 69 | 0.2182 | 93.4586 | 94.5879
0.6827 | 0.3742 | 70 | 0.2096 | 93.6842 | 94.7584
0.4472 | 0.3796 | 71 | 0.2018 | 93.7594 | 94.8085
0.1831 | 0.3849 | 72 | 0.1945 | 93.6842 | 94.6772
0.2381 | 0.3902 | 73 | 0.1891 | 93.7594 | 94.6731
0.1892 | 0.3956 | 74 | 0.1830 | 93.8346 | 94.7656
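
The Epoch and Step columns together hint at the size of the training set, which the card does not state. The sketch below is a rough estimate only: it takes the last logged row (step 74 at epoch 0.3956), assumes the epoch fraction grows linearly with the step count, and uses the total train batch size of 64 from the hyperparameter list. Because the Epoch column is rounded to four decimals, the results are approximate.

```python
# Last logged row of the training results: step 74 at epoch 0.3956.
last_step = 74
last_epoch = 0.3956

# Taken from the hyperparameter list.
total_train_batch_size = 64
training_steps = 150

# Optimizer steps needed to cover one full epoch, inferred from the log.
steps_per_epoch = last_step / last_epoch

# Rough training-set size; an estimate, since the Epoch column is rounded.
approx_train_examples = steps_per_epoch * total_train_batch_size

# Fraction of an epoch covered if all configured steps run.
epochs_at_end = training_steps / steps_per_epoch

if __name__ == "__main__":
    print("steps per epoch:", round(steps_per_epoch, 1))
    print("approx. training examples:", round(approx_train_examples))
    print("epochs after all steps:", round(epochs_at_end, 2))
```

On these assumptions one epoch is about 187 optimizer steps, the training set holds roughly 12,000 examples, and the configured 150 steps cover about 0.8 of an epoch.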

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size: 109M params (Safetensors, tensor type F32)

Model tree for Mediocre-Judge/bengali_qa_model_AGGRO_bert_base_uncased: fine-tuned from bert-base-uncased