bengali_qa_model_AGGRO_V2

This model is a fine-tuned version of sagorsarker/bangla-bert-base on a dataset that is not identified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.1885
  • Exact Match: 96.0
  • F1 Score: 96.3051
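
A minimal loading sketch for trying the checkpoint, assuming it is published on the Hugging Face Hub as Mediocre-Judge/bengali_qa_model_AGGRO_V2 and exposes an extractive question-answering head. It relies on the transformers pipeline API; the helper names load_qa_pipeline and answer_question are illustrative and not part of the model.

```python
from functools import lru_cache

# Assumption: the checkpoint is hosted under this Hub repository ID.
MODEL_ID = "Mediocre-Judge/bengali_qa_model_AGGRO_V2"


@lru_cache(maxsize=1)
def load_qa_pipeline():
    """Build the question-answering pipeline once and reuse it afterwards."""
    # Imported here so that defining these helpers neither requires
    # transformers to be installed nor triggers a model download.
    from transformers import pipeline

    return pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)


def answer_question(question: str, context: str) -> dict:
    """Return the best answer span the pipeline finds in `context`."""
    qa = load_qa_pipeline()
    return qa(question=question, context=context)
```

Calling answer_question(question, context) with a Bengali question and a passage that contains the answer should return a dict with answer, score, start and end keys. The first call downloads the weights, so it needs network access.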

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 3407
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 50
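
The settings above can be restated as a small sketch that also checks the arithmetic behind total_train_batch_size and shows the shape of a cosine schedule with linear warmup. The dictionary keys follow the argument names of transformers.TrainingArguments, which is an assumed mapping; NUM_DEVICES = 1 is inferred from 4 x 16 = 64; and learning_rate_at is a hypothetical helper that approximates the scheduler rather than reproducing the exact run.

```python
import math

# Values copied from the hyperparameter list. The keys use
# transformers.TrainingArguments argument names (assumed mapping).
HPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 3407,
    "gradient_accumulation_steps": 16,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "max_steps": 50,
}

# Assumption: a single device, since 4 * 16 * 1 equals the reported 64.
NUM_DEVICES = 1


def effective_batch_size(hp, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def warmup_steps(hp):
    """Number of optimizer steps spent in linear warmup."""
    return math.ceil(hp["max_steps"] * hp["warmup_ratio"])


def learning_rate_at(step, hp):
    """Learning rate after `step` updates: linear warmup, then half-cosine decay."""
    peak = hp["learning_rate"]
    warmup = warmup_steps(hp)
    total = hp["max_steps"]
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(HPARAMS))  # 64
print(warmup_steps(HPARAMS))          # 5
```

Under these assumptions the learning rate climbs linearly to 1e-05 over the first 5 optimizer steps and then decays along a half cosine to zero at step 50.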

Training results

Training Loss  Epoch   Step  Validation Loss  Exact Match  F1 Score
6.0684         0.0053  1     6.0629           0.0          6.2927
6.033          0.0107  2     5.9761           0.0          6.7935
6.0144         0.0160  3     5.8037           0.0          9.7900
5.8029         0.0214  4     5.5486           0.5263       19.3074
5.6831         0.0267  5     5.2126           2.2556       37.8180
5.26           0.0321  6     4.7970           5.9398       49.8764
4.8899         0.0374  7     4.3855           9.3233       55.2670
4.5683         0.0428  8     3.9798           15.3383      59.7750
4.0571         0.0481  9     3.5837           22.6316      63.9729
3.6658         0.0535  10    3.2052           28.7218      66.1381
3.3842         0.0588  11    2.8517           33.6842      68.2625
3.0377         0.0641  12    2.5296           38.3459      69.6544
2.933          0.0695  13    2.2425           42.1053      70.3538
2.383          0.0748  14    1.9875           45.9398      71.7662
2.12           0.0802  15    1.7636           50.1504      73.3768
1.7072         0.0855  16    1.5667           55.4887      75.4763
1.7314         0.0909  17    1.3929           59.8496      77.6552
1.4855         0.0962  18    1.2390           64.0602      80.1659
1.4605         0.1016  19    1.1030           68.2707      82.0848
1.4278         0.1069  20    0.9825           72.4060      84.1071
1.1391         0.1123  21    0.8741           76.1654      85.9345
1.2315         0.1176  22    0.7780           79.0977      87.2864
0.9215         0.1230  23    0.6933           81.6541      88.3887
0.7547         0.1283  24    0.6182           83.5338      89.2823
0.717          0.1336  25    0.5517           86.2406      90.9047
1.0054         0.1390  26    0.4950           88.1203      91.8787
0.5741         0.1443  27    0.4465           89.3233      92.5173
0.6248         0.1497  28    0.4053           90.3008      92.8381
0.4378         0.1550  29    0.3709           91.2782      93.3403
0.3546         0.1604  30    0.3421           92.2556      93.8510
0.542          0.1657  31    0.3188           92.8571      94.1842
0.2279         0.1711  32    0.2997           93.4586      94.3692
0.1765         0.1764  33    0.2843           93.8346      94.5317
0.256          0.1818  34    0.2721           94.2105      94.8025
0.2041         0.1871  35    0.2623           94.2857      94.8878
0.292          0.1924  36    0.2545           94.4361      94.9898
0.2241         0.1978  37    0.2484           94.7368      95.2014
0.5822         0.2031  38    0.2434           94.8120      95.3437
0.3077         0.2085  39    0.2394           94.9624      95.4440
0.3954         0.2138  40    0.2362           94.9624      95.4440
0.3814         0.2192  41    0.2337           94.9624      95.4440
0.1033         0.2245  42    0.2317           94.9624      95.4440
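
Exact Match and F1 in the table are span-level question-answering metrics. The card does not say which implementation produced them, so the following is only a sketch of the usual SQuAD-style definitions, simplified to whitespace tokenization with no punctuation or article stripping. It illustrates why F1 can sit far above Exact Match early in training (for example 37.8180 versus 2.2556 at step 5): F1 gives credit for partial token overlap, while Exact Match requires the whole span to match. All function names are illustrative.

```python
from collections import Counter


def normalize(text):
    """Trim the text and split it into whitespace-separated tokens."""
    return text.strip().split()


def exact_match(prediction, reference):
    """1.0 if the token sequences are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Two empty answers agree; one empty answer scores zero.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both the prediction and the reference.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def corpus_scores(predictions, references):
    """Average both metrics over a dataset and scale them to 0-100."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need equally sized, non-empty lists")
    pairs = list(zip(predictions, references))
    n = len(pairs)
    return {
        "exact_match": 100.0 * sum(exact_match(p, r) for p, r in pairs) / n,
        "f1": 100.0 * sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

A prediction of "ঢাকা শহর" against the reference "ঢাকা" scores 0 on Exact Match but 2/3 on F1, because one of the two predicted tokens matches the single reference token.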

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size: 164M params (Safetensors, tensor type F32)