Bn_GEDC

This model is a fine-tuned version of facebook/mbart-large-50 on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.0461
  • WER: 0.07
  • BLEU: 0.847
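
The repository does not document the task or data; as a hedged orientation, the sketch below assumes a Bengali text-to-text task (suggested by the "Bn_" prefix, plausibly grammatical error detection/correction given "GEDC", and the mBART-50 base). The `bn_IN` language codes, the generation settings, and the example input are illustrative assumptions, not confirmed details.

```python
# Minimal usage sketch (assumptions: Bengali-to-Bengali text generation with
# bn_IN language codes; the actual task and data are not documented on this card).
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_id = "Virus-Proton/Bn_GEDC"
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "bn_IN"  # assumption: Bengali source text
inputs = tokenizer("<Bengali input sentence>", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["bn_IN"],  # assumption: Bengali output
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```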

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • num_epochs: 2
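
As a reproduction aid, here is a sketch of how these values map onto `Seq2SeqTrainingArguments` in Transformers 4.40.x. This is not the author's actual training script: `output_dir`, `predict_with_generate`, and the step-based evaluation cadence (inferred from the 2000-step rows in the results table below) are assumptions.

```python
# Sketch only: maps the listed hyperparameters onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Bn_GEDC",              # assumption: any local path works
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,                    # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_restarts",
    num_train_epochs=2,
    evaluation_strategy="steps",       # assumption: inferred from the eval rows
    eval_steps=2000,
    predict_with_generate=True,        # assumption: needed for BLEU/WER eval
)
```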

Training results

| Training Loss | Epoch  | Step   | BLEU  | Validation Loss | WER   |
|:-------------:|:------:|:------:|:-----:|:---------------:|:-----:|
| 0.461         | 0.0245 | 2000   | 0.604 | 0.0894          | 0.185 |
| 0.0683        | 0.0490 | 4000   | 0.683 | 0.0677          | 0.144 |
| 0.052         | 0.0735 | 6000   | 0.71  | 0.0621          | 0.134 |
| 0.0427        | 0.0980 | 8000   | 0.732 | 0.0572          | 0.121 |
| 0.0373        | 0.1225 | 10000  | 0.749 | 0.0531          | 0.113 |
| 0.0335        | 0.1470 | 12000  | 0.759 | 0.0514          | 0.108 |
| 0.0397        | 0.1715 | 14000  | 0.77  | 0.0506          | 0.103 |
| 0.029         | 0.1960 | 16000  | 0.772 | 0.0508          | 0.103 |
| 0.0277        | 0.2205 | 18000  | 0.779 | 0.0496          | 0.099 |
| 0.0284        | 0.2450 | 20000  | 0.785 | 0.0468          | 0.096 |
| 0.0249        | 0.2695 | 22000  | 0.785 | 0.0479          | 0.097 |
| 0.0239        | 0.2940 | 24000  | 0.787 | 0.0481          | 0.095 |
| 0.0229        | 0.3185 | 26000  | 0.791 | 0.0473          | 0.094 |
| 0.0223        | 0.3430 | 28000  | 0.795 | 0.0461          | 0.092 |
| 0.0216        | 0.3674 | 30000  | 0.798 | 0.0471          | 0.091 |
| 0.0209        | 0.3919 | 32000  | 0.798 | 0.0467          | 0.091 |
| 0.0203        | 0.4164 | 34000  | 0.802 | 0.0464          | 0.089 |
| 0.0202        | 0.4409 | 36000  | 0.806 | 0.0454          | 0.087 |
| 0.0194        | 0.4654 | 38000  | 0.806 | 0.0462          | 0.087 |
| 0.0187        | 0.4899 | 40000  | 0.806 | 0.0471          | 0.087 |
| 0.0184        | 0.5144 | 42000  | 0.809 | 0.0462          | 0.086 |
| 0.0179        | 0.5389 | 44000  | 0.811 | 0.0444          | 0.085 |
| 0.0176        | 0.5634 | 46000  | 0.812 | 0.0460          | 0.085 |
| 0.0174        | 0.5879 | 48000  | 0.811 | 0.0469          | 0.086 |
| 0.0171        | 0.6124 | 50000  | 0.813 | 0.0465          | 0.084 |
| 0.0166        | 0.6369 | 52000  | 0.816 | 0.0446          | 0.083 |
| 0.016         | 0.6614 | 54000  | 0.816 | 0.0461          | 0.083 |
| 0.0162        | 0.6859 | 56000  | 0.818 | 0.0451          | 0.082 |
| 0.0158        | 0.7104 | 58000  | 0.819 | 0.0449          | 0.082 |
| 0.0156        | 0.7349 | 60000  | 0.818 | 0.0454          | 0.082 |
| 0.0157        | 0.7594 | 62000  | 0.82  | 0.0455          | 0.082 |
| 0.015         | 0.7839 | 64000  | 0.822 | 0.0455          | 0.081 |
| 0.0148        | 0.8084 | 66000  | 0.822 | 0.0461          | 0.081 |
| 0.0146        | 0.8329 | 68000  | 0.823 | 0.0460          | 0.08  |
| 0.0145        | 0.8574 | 70000  | 0.824 | 0.0446          | 0.08  |
| 0.0144        | 0.8819 | 72000  | 0.824 | 0.0450          | 0.079 |
| 0.0141        | 0.9064 | 74000  | 0.822 | 0.0477          | 0.081 |
| 0.0139        | 0.9309 | 76000  | 0.826 | 0.0446          | 0.079 |
| 0.0137        | 0.9554 | 78000  | 0.827 | 0.0452          | 0.078 |
| 0.0136        | 0.9799 | 80000  | 0.827 | 0.0455          | 0.078 |
| 0.0128        | 1.0044 | 82000  | 0.829 | 0.0462          | 0.078 |
| 0.0104        | 1.0289 | 84000  | 0.829 | 0.0456          | 0.077 |
| 0.0105        | 1.0534 | 86000  | 0.829 | 0.0465          | 0.078 |
| 0.0103        | 1.0779 | 88000  | 0.831 | 0.0443          | 0.077 |
| 0.01          | 1.1023 | 90000  | 0.829 | 0.0456          | 0.077 |
| 0.0103        | 1.1268 | 92000  | 0.83  | 0.0466          | 0.077 |
| 0.0101        | 1.1513 | 94000  | 0.832 | 0.0462          | 0.076 |
| 0.01          | 1.1758 | 96000  | 0.832 | 0.0458          | 0.076 |
| 0.01          | 1.2003 | 98000  | 0.834 | 0.0460          | 0.075 |
| 0.0098        | 1.2248 | 100000 | 0.834 | 0.0464          | 0.076 |
| 0.0098        | 1.2493 | 102000 | 0.834 | 0.0455          | 0.075 |
| 0.0096        | 1.2738 | 104000 | 0.836 | 0.0453          | 0.075 |
| 0.0099        | 1.2983 | 106000 | 0.835 | 0.0469          | 0.075 |
| 0.0095        | 1.3228 | 108000 | 0.836 | 0.0466          | 0.075 |
| 0.0094        | 1.3473 | 110000 | 0.836 | 0.0461          | 0.075 |
| 0.0094        | 1.3718 | 112000 | 0.837 | 0.0465          | 0.074 |
| 0.0093        | 1.3963 | 114000 | 0.838 | 0.0469          | 0.074 |
| 0.0092        | 1.4208 | 116000 | 0.838 | 0.0469          | 0.074 |
| 0.0092        | 1.4453 | 118000 | 0.838 | 0.0476          | 0.074 |
| 0.0092        | 1.4698 | 120000 | 0.839 | 0.0466          | 0.074 |
| 0.0091        | 1.4943 | 122000 | 0.841 | 0.0462          | 0.072 |
| 0.0089        | 1.5188 | 124000 | 0.839 | 0.0470          | 0.074 |
| 0.0088        | 1.5433 | 126000 | 0.839 | 0.0473          | 0.073 |
| 0.0087        | 1.5678 | 128000 | 0.841 | 0.0457          | 0.073 |
| 0.0086        | 1.5923 | 130000 | 0.843 | 0.0453          | 0.072 |
| 0.0085        | 1.6168 | 132000 | 0.841 | 0.0471          | 0.073 |
| 0.0086        | 1.6413 | 134000 | 0.842 | 0.0471          | 0.072 |
| 0.0086        | 1.6658 | 136000 | 0.844 | 0.0446          | 0.072 |
| 0.0082        | 1.6903 | 138000 | 0.844 | 0.0458          | 0.071 |
| 0.008         | 1.7148 | 140000 | 0.845 | 0.0460          | 0.071 |
| 0.0078        | 1.7393 | 142000 | 0.846 | 0.0460          | 0.071 |
| 0.008         | 1.7638 | 144000 | 0.846 | 0.0456          | 0.07  |
| 0.0077        | 1.7883 | 146000 | 0.847 | 0.0461          | 0.071 |
| 0.0077        | 1.8127 | 148000 | 0.847 | 0.0460          | 0.07  |
| 0.0077        | 1.8372 | 150000 | 0.847 | 0.0464          | 0.07  |
| 0.0076        | 1.8617 | 152000 | 0.847 | 0.0463          | 0.07  |
| 0.0076        | 1.8862 | 154000 | 0.847 | 0.0462          | 0.07  |
| 0.0076        | 1.9107 | 156000 | 0.847 | 0.0461          | 0.07  |
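
How WER and BLEU were configured is not documented on this card. For illustration only, metrics of this kind are commonly computed with the `evaluate` library; the placeholder strings below stand in for real model outputs and references.

```python
# Illustrative only: the exact metric settings used for this card are unknown.
import evaluate

wer_metric = evaluate.load("wer")
bleu_metric = evaluate.load("bleu")  # assumption: the 0-1 BLEU scale matches the table

predictions = ["a corrected sentence"]  # placeholder model outputs
references = ["a corrected sentence"]   # placeholder gold targets

print(wer_metric.compute(predictions=predictions, references=references))
print(bleu_metric.compute(predictions=predictions,
                          references=[[r] for r in references])["bleu"])
```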

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1