bambara_mms_10_hour_mixed_dataset

This model is a fine-tuned version of facebook/mms-1b-all. The card's metadata does not name the training dataset. At the last logged evaluation (step 29000), it achieves the following results on the evaluation set:

  • Loss: 2.2512
  • WER: 0.52
  • CER: 0.3632
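
The two error rates above are word error rate and character error rate. The card does not say which library or text normalization produced them, so the following is only a minimal sketch of the usual definition: edit distance between reference and hypothesis, divided by the reference length. The function names and the English example strings are illustrative, not taken from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance (reference must be non-empty)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance (reference must be non-empty)."""
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one substituted word out of three, one character out of eleven.
print(wer("the cat sat", "the cat sit"))
print(cer("the cat sat", "the cat sit"))
```

Corpus-level scores are normally computed by summing the edit distances over all utterances and dividing by the total reference length, not by averaging per-utterance rates.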

Model description

More information needed

Intended uses & limitations

More information needed
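
Until usage is documented here, the following is a hedged sketch of how the checkpoint might be loaded for transcription. It assumes the repository loads through the standard `transformers` automatic-speech-recognition pipeline and that the input is a 16 kHz-compatible audio file; `transcribe` and the file path are hypothetical names used for illustration.

```python
MODEL_ID = "asr-africa/bambara_mms_10_hour_mixed_dataset"


def transcribe(audio_path):
    """Return the transcription of one audio file (hypothetical helper)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]


# Example call, with a placeholder path:
# print(transcribe("sample.wav"))
```

Given the reported WER of 0.52, transcriptions from this sketch should be expected to contain frequent word-level errors.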

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 50
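
As a rough illustration, the list above can be written with the keyword names used by `transformers.TrainingArguments`. This assumes a single training device, which is consistent with 8 × 2 = 16 for the total train batch size. The `linear_lr` helper is a hypothetical re-implementation of a linear warmup and decay schedule, and `total_steps=29475` is an estimate, not a value stated in this card.

```python
# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments. Single-device training is an assumption.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 50,
}

# Effective batch size per optimizer step on one device.
effective_batch = (training_kwargs["per_device_train_batch_size"]
                   * training_kwargs["gradient_accumulation_steps"])


def linear_lr(step, total_steps, base_lr=3e-4, warmup_steps=500):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(effective_batch)
print(linear_lr(250, total_steps=29475))   # halfway through warmup
print(linear_lr(29000, total_steps=29475)) # near the end of training
```

The dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`; the output directory and any logging or evaluation settings are not recorded in this card.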

Training results

Training Loss | Epoch | Step | Validation Loss | WER | CER
1.9455 | 0.8482 | 500 | 1.5056 | 0.8112 | 0.4290
1.5004 | 1.6964 | 1000 | 1.3041 | 0.7323 | 0.3374
1.3813 | 2.5445 | 1500 | 1.2313 | 0.7115 | 0.3728
1.3102 | 3.3927 | 2000 | 1.1950 | 0.7120 | 0.4489
1.2181 | 4.2409 | 2500 | 1.1980 | 0.6981 | 0.4080
1.174 | 5.0891 | 3000 | 1.1699 | 0.7216 | 0.3960
1.1191 | 5.9372 | 3500 | 1.1130 | 0.7440 | 0.4183
1.0556 | 6.7854 | 4000 | 1.0874 | 0.6244 | 0.3241
1.0105 | 7.6336 | 4500 | 1.0767 | 0.6353 | 0.3932
0.9775 | 8.4818 | 5000 | 1.1265 | 0.6319 | 0.3856
0.9283 | 9.3299 | 5500 | 1.1483 | 0.6483 | 0.4394
0.8955 | 10.1781 | 6000 | 1.0845 | 0.6544 | 0.4310
0.852 | 11.0263 | 6500 | 1.0088 | 0.5970 | 0.3317
0.7987 | 11.8745 | 7000 | 1.0797 | 0.6010 | 0.3611
0.7569 | 12.7226 | 7500 | 1.0715 | 0.6100 | 0.3884
0.7299 | 13.5708 | 8000 | 1.1275 | 0.6071 | 0.3978
0.6995 | 14.4190 | 8500 | 1.1741 | 0.6209 | 0.4731
0.6671 | 15.2672 | 9000 | 1.0855 | 0.5953 | 0.3887
0.6431 | 16.1154 | 9500 | 1.1793 | 0.5662 | 0.3377
0.612 | 16.9635 | 10000 | 1.1662 | 0.5778 | 0.3876
0.5784 | 17.8117 | 10500 | 1.1753 | 0.5764 | 0.3820
0.5501 | 18.6599 | 11000 | 1.2029 | 0.5832 | 0.3877
0.5286 | 19.5081 | 11500 | 1.3072 | 0.6082 | 0.4344
0.5066 | 20.3562 | 12000 | 1.1977 | 0.5755 | 0.3815
0.4812 | 21.2044 | 12500 | 1.2332 | 0.5624 | 0.3667
0.4609 | 22.0526 | 13000 | 1.3325 | 0.5465 | 0.3521
0.4338 | 22.9008 | 13500 | 1.3214 | 0.5512 | 0.3628
0.4244 | 23.7489 | 14000 | 1.4046 | 0.5612 | 0.3858
0.3963 | 24.5971 | 14500 | 1.4522 | 0.5704 | 0.3985
0.3844 | 25.4453 | 15000 | 1.3522 | 0.5706 | 0.3945
0.3665 | 26.2935 | 15500 | 1.3853 | 0.5391 | 0.3524
0.3494 | 27.1416 | 16000 | 1.5375 | 0.5476 | 0.3784
0.3338 | 27.9898 | 16500 | 1.4892 | 0.5563 | 0.3732
0.3172 | 28.8380 | 17000 | 1.5445 | 0.5500 | 0.3761
0.308 | 29.6862 | 17500 | 1.6170 | 0.5530 | 0.3821
0.2871 | 30.5344 | 18000 | 1.6431 | 0.5499 | 0.3889
0.2724 | 31.3825 | 18500 | 1.6469 | 0.5362 | 0.3614
0.2653 | 32.2307 | 19000 | 1.6854 | 0.5428 | 0.3648
0.2505 | 33.0789 | 19500 | 1.7214 | 0.5413 | 0.3654
0.2405 | 33.9271 | 20000 | 1.7085 | 0.5550 | 0.3809
0.2304 | 34.7752 | 20500 | 1.7357 | 0.5467 | 0.3772
0.2259 | 35.6234 | 21000 | 1.7828 | 0.5465 | 0.3799
0.2111 | 36.4716 | 21500 | 1.8705 | 0.5350 | 0.3678
0.2014 | 37.3198 | 22000 | 1.8758 | 0.5361 | 0.3682
0.2016 | 38.1679 | 22500 | 1.9686 | 0.5344 | 0.3842
0.1884 | 39.0161 | 23000 | 1.9711 | 0.5288 | 0.3742
0.1842 | 39.8643 | 23500 | 1.9821 | 0.5337 | 0.3827
0.1745 | 40.7125 | 24000 | 1.9664 | 0.5262 | 0.3730
0.1665 | 41.5606 | 24500 | 2.0731 | 0.5327 | 0.3733
0.1639 | 42.4088 | 25000 | 2.1357 | 0.5286 | 0.3694
0.1536 | 43.2570 | 25500 | 2.0855 | 0.5290 | 0.3640
0.1532 | 44.1052 | 26000 | 2.1890 | 0.5238 | 0.3635
0.1443 | 44.9534 | 26500 | 2.1638 | 0.5296 | 0.3666
0.1428 | 45.8015 | 27000 | 2.1495 | 0.5232 | 0.3624
0.1377 | 46.6497 | 27500 | 2.2047 | 0.5234 | 0.3580
0.1348 | 47.4979 | 28000 | 2.2385 | 0.5215 | 0.3651
0.1285 | 48.3461 | 28500 | 2.2492 | 0.5203 | 0.3650
0.1303 | 49.1942 | 29000 | 2.2512 | 0.52 | 0.3632
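
The table shows validation loss bottoming out early (1.0088 at step 6500) and then rising to 2.2512, while WER keeps improving until the final step. The short sketch below picks the best row per metric from a four-row excerpt of the table and estimates the number of optimizer steps per epoch. The training-set size is only a rough estimate, assuming 16 examples per optimizer step as listed in the hyperparameters.

```python
# (epoch, step, validation_loss, wer, cer): an excerpt of rows copied from the table.
rows = [
    (0.8482, 500, 1.5056, 0.8112, 0.4290),
    (6.7854, 4000, 1.0874, 0.6244, 0.3241),
    (11.0263, 6500, 1.0088, 0.5970, 0.3317),
    (49.1942, 29000, 2.2512, 0.52, 0.3632),
]

best_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_wer = min(rows, key=lambda r: r[3])   # lowest word error rate
best_cer = min(rows, key=lambda r: r[4])   # lowest character error rate

# Optimizer steps per epoch, estimated from the last logged row.
last_epoch, last_step = rows[-1][0], rows[-1][1]
steps_per_epoch = last_step / last_epoch

# Rough training-set size, assuming 16 examples per optimizer step.
approx_examples = steps_per_epoch * 16

print("best validation loss at step", best_loss[1])
print("best WER at step", best_wer[1])
print("best CER at step", best_cer[1])
print("steps per epoch ~", round(steps_per_epoch, 1))
print("training examples ~", round(approx_examples))
```

Because the metrics disagree on the best step, which checkpoint to use depends on the selection criterion. The card does not say whether the published weights come from the final step or from an earlier checkpoint.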

Framework versions

  • Transformers 4.45.1
  • PyTorch 2.1.0+cu118
  • Datasets 2.17.0
  • Tokenizers 0.20.3