mms-1b-lozgen-balanced-model

This model is a fine-tuned version of facebook/mms-1b-all on the LOZGEN - TOI dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5419
  • WER: 0.3662
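
To sanity-check the model against these numbers, the snippet below shows one way to transcribe a 16 kHz audio file with this checkpoint. It is a minimal sketch assuming the checkpoint loads as a standard Wav2Vec2ForCTC model, as MMS fine-tunes typically do; the file path sample.wav is a placeholder.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Fine-tuned checkpoint (repo id from the model card).
model_id = "csikasote/mms-1b-lozgen-balanced-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS models expect 16 kHz mono audio; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```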

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
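
As a rough illustration, these settings map onto Hugging Face TrainingArguments as sketched below. The output_dir and the evaluation cadence are assumptions (the results table suggests evaluation every 100 steps) and are not stated explicitly in the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-lozgen-balanced-model",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                # "Native AMP" mixed precision
    eval_strategy="steps",    # assumed from the 100-step eval cadence
    eval_steps=100,           # assumed
)
```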

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER    |
|--------------:|--------:|-----:|----------------:|-------:|
| 6.6518        | 0.8130  | 100  | 3.1671          | 0.9943 |
| 2.6718        | 1.6260  | 200  | 2.2609          | 0.9316 |
| 1.4567        | 2.4390  | 300  | 0.7404          | 0.7192 |
| 0.7044        | 3.2520  | 400  | 0.6402          | 0.5321 |
| 0.6221        | 4.0650  | 500  | 0.6065          | 0.5166 |
| 0.6016        | 4.8780  | 600  | 0.5948          | 0.4787 |
| 0.5686        | 5.6911  | 700  | 0.5806          | 0.4641 |
| 0.6054        | 6.5041  | 800  | 0.5716          | 0.4486 |
| 0.4871        | 7.3171  | 900  | 0.5732          | 0.4446 |
| 0.5275        | 8.1301  | 1000 | 0.5667          | 0.4350 |
| 0.5199        | 8.9431  | 1100 | 0.5688          | 0.4300 |
| 0.5031        | 9.7561  | 1200 | 0.5516          | 0.4443 |
| 0.4533        | 10.5691 | 1300 | 0.5577          | 0.4179 |
| 0.4738        | 11.3821 | 1400 | 0.5536          | 0.4057 |
| 0.4925        | 12.1951 | 1500 | 0.5503          | 0.3969 |
| 0.441         | 13.0081 | 1600 | 0.5403          | 0.4005 |
| 0.4177        | 13.8211 | 1700 | 0.5563          | 0.3914 |
| 0.4589        | 14.6341 | 1800 | 0.5394          | 0.3876 |
| 0.4131        | 15.4472 | 1900 | 0.5425          | 0.3957 |
| 0.393         | 16.2602 | 2000 | 0.5469          | 0.3907 |
| 0.4235        | 17.0732 | 2100 | 0.5357          | 0.3878 |
| 0.4113        | 17.8862 | 2200 | 0.5391          | 0.3802 |
| 0.3781        | 18.6992 | 2300 | 0.5324          | 0.3728 |
| 0.3706        | 19.5122 | 2400 | 0.5463          | 0.3826 |
| 0.3617        | 20.3252 | 2500 | 0.5391          | 0.3697 |
| 0.401         | 21.1382 | 2600 | 0.5417          | 0.3664 |
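
The WER column above is the word error rate on the validation set. A minimal sketch for reproducing the metric on your own transcriptions with the evaluate library follows; the reference/prediction strings are hypothetical placeholders, used only to show the call signature.

```python
import evaluate

# WER = (substitutions + insertions + deletions) / words in the reference.
wer_metric = evaluate.load("wer")

# Hypothetical placeholder pairs.
references = ["the cat sat on the mat"]
predictions = ["the cat sit on mat"]

print(wer_metric.compute(references=references, predictions=predictions))
```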

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size

  • 965M parameters (Safetensors, F32)
