mms-1b-lozgen-combined-model

This model is a fine-tuned version of facebook/mms-1b-all on the LOZGEN - LOZ dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4288
  • Wer: 0.3297
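The card does not include a usage snippet, so below is a minimal inference sketch. It assumes the standard CTC interface that MMS checkpoints expose in transformers (AutoProcessor plus Wav2Vec2ForCTC); the file example.wav is a hypothetical 16 kHz recording, not part of this card.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-lozgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS checkpoints expect 16 kHz mono audio.
# "example.wav" is a placeholder path for illustration.
speech, _ = librosa.load("example.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the argmax per frame, then collapse
# repeats and blanks inside processor.decode.
ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(ids))
```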

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
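For reference, the hyperparameters above map onto the transformers TrainingArguments API roughly as sketched below. This is a hypothetical reconstruction for readability; the actual training script used for this checkpoint is not included in the card.

```python
from transformers import TrainingArguments

# Assumed mapping of the listed hyperparameters; not the author's script.
training_args = TrainingArguments(
    output_dir="mms-1b-lozgen-combined-model",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",        # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                  # "Native AMP" mixed-precision training
)
```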

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 6.5686        | 0.4065  | 100  | 3.0827          | 0.9701 |
| 2.6223        | 0.8130  | 200  | 2.2379          | 0.9112 |
| 1.4386        | 1.2195  | 300  | 0.6910          | 0.7809 |
| 0.8073        | 1.6260  | 400  | 0.5903          | 0.5699 |
| 0.651         | 2.0325  | 500  | 0.5555          | 0.5037 |
| 0.655         | 2.4390  | 600  | 0.5298          | 0.4818 |
| 0.6579        | 2.8455  | 700  | 0.5298          | 0.4603 |
| 0.5699        | 3.2520  | 800  | 0.5160          | 0.4284 |
| 0.6104        | 3.6585  | 900  | 0.5070          | 0.4320 |
| 0.604         | 4.0650  | 1000 | 0.4978          | 0.4098 |
| 0.5681        | 4.4715  | 1100 | 0.4975          | 0.4072 |
| 0.5493        | 4.8780  | 1200 | 0.4878          | 0.4038 |
| 0.581         | 5.2846  | 1300 | 0.4826          | 0.3965 |
| 0.5746        | 5.6911  | 1400 | 0.4793          | 0.4242 |
| 0.5238        | 6.0976  | 1500 | 0.4724          | 0.3833 |
| 0.5204        | 6.5041  | 1600 | 0.4866          | 0.3864 |
| 0.5563        | 6.9106  | 1700 | 0.4672          | 0.3839 |
| 0.5121        | 7.3171  | 1800 | 0.4664          | 0.3719 |
| 0.4774        | 7.7236  | 1900 | 0.4625          | 0.3652 |
| 0.5356        | 8.1301  | 2000 | 0.4721          | 0.3693 |
| 0.4385        | 8.5366  | 2100 | 0.4560          | 0.3695 |
| 0.5561        | 8.9431  | 2200 | 0.4453          | 0.3594 |
| 0.414         | 9.3496  | 2300 | 0.4489          | 0.3546 |
| 0.4763        | 9.7561  | 2400 | 0.4525          | 0.3521 |
| 0.5317        | 10.1626 | 2500 | 0.4424          | 0.3557 |
| 0.4939        | 10.5691 | 2600 | 0.4398          | 0.3502 |
| 0.4456        | 10.9756 | 2700 | 0.4415          | 0.3467 |
| 0.4583        | 11.3821 | 2800 | 0.4502          | 0.3446 |
| 0.4573        | 11.7886 | 2900 | 0.4267          | 0.3403 |
| 0.398         | 12.1951 | 3000 | 0.4305          | 0.3406 |
| 0.472         | 12.6016 | 3100 | 0.4268          | 0.3320 |
| 0.3993        | 13.0081 | 3200 | 0.4288          | 0.3297 |
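The Wer column is the word error rate on the validation set, where lower is better. It can be computed with the evaluate library's wer metric, as in the sketch below; the strings are placeholders for illustration, not actual LOZGEN transcripts.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder hypothesis/reference pairs; the real metric was computed
# on the LOZGEN evaluation set.
predictions = ["the model transcribed this sentence"]
references = ["the model transcribed this sentence correctly"]

# WER = (substitutions + insertions + deletions) / reference word count
print(wer_metric.compute(predictions=predictions, references=references))
```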

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0