mms-1b-swagen-baseline-model

This model is a fine-tuned version of facebook/mms-1b-all on the SWAGEN - SWA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2281
  • Wer: 0.1941
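
The reported Wer is the word error rate: the word-level edit distance between the model's hypothesis and the reference transcript, divided by the number of reference words. A minimal sketch of the metric (the actual evaluation likely used the `evaluate`/`jiwer` implementation — that is an assumption; this pure-Python version computes the same quantity):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, one substituted word out of five reference words gives a WER of 0.2, roughly the level this model reaches on the evaluation set.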

Model description

More information needed

Intended uses & limitations

More information needed
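
The card provides no usage snippet; a hedged sketch of how this checkpoint could be loaded for speech recognition via the Transformers `pipeline` API (standard for MMS/Wav2Vec2 CTC checkpoints — the exact preprocessing this model expects is an assumption). The import is deferred so nothing is downloaded until the function is called:

```python
def load_asr(model_id: str = "csikasote/mms-1b-swagen-baseline-model"):
    """Build an automatic-speech-recognition pipeline for this checkpoint.

    Note: downloads the model weights on first call and requires
    `transformers` and `torch` to be installed.
    """
    from transformers import pipeline  # deferred so the sketch is cheap to import
    return pipeline("automatic-speech-recognition", model=model_id)

# Usage (file path is illustrative; audio is resampled to 16 kHz by the pipeline):
# asr = load_asr()
# print(asr("sample.wav")["text"])
```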

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
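<!-- placeholder: not used -->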
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
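
The listed total_train_batch_size follows from the per-device batch size times the gradient-accumulation steps (4 × 2 = 8), and lr_scheduler_type: linear with 100 warmup steps means the learning rate ramps from 0 to 3e-4 over the first 100 optimizer steps, then decays linearly toward 0. A small pure-Python sketch of that schedule (the run used the Transformers `linear` scheduler; `total_steps` below is an illustrative parameter, not the run's actual step count):

```python
def linear_schedule_lr(step: int, base_lr: float = 3e-4,
                       warmup_steps: int = 100, total_steps: int = 2600) -> float:
    """Linear warmup, then linear decay to zero (Transformers-style `linear` schedule)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    # Decay from base_lr at the end of warmup down to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Effective batch size: per-device batch size x gradient-accumulation steps.
train_batch_size, grad_accum = 4, 2
total_train_batch_size = train_batch_size * grad_accum  # 8
```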

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer    |
|:-------------:|:------:|:----:|:---------------:|:------:|
| 15.3971       | 0.2387 | 100  | 3.5940          | 1.0051 |
| 6.4342        | 0.4773 | 200  | 2.9924          | 0.9869 |
| 3.4197        | 0.7160 | 300  | 0.2737          | 0.2023 |
| 0.56          | 0.9547 | 400  | 0.2543          | 0.1962 |
| 0.5187        | 1.1933 | 500  | 0.2420          | 0.1929 |
| 0.5115        | 1.4320 | 600  | 0.2393          | 0.1947 |
| 0.5086        | 1.6706 | 700  | 0.2360          | 0.1892 |
| 0.4801        | 1.9093 | 800  | 0.2333          | 0.1874 |
| 0.5281        | 2.1480 | 900  | 0.2355          | 0.1958 |
| 0.4683        | 2.3866 | 1000 | 0.2378          | 0.1956 |
| 0.4548        | 2.6253 | 1100 | 0.2283          | 0.1874 |
| 0.4654        | 2.8640 | 1200 | 0.2323          | 0.1892 |
| 0.453         | 3.1026 | 1300 | 0.2288          | 0.1898 |
| 0.4542        | 3.3413 | 1400 | 0.2303          | 0.1902 |
| 0.4621        | 3.5800 | 1500 | 0.2253          | 0.1865 |
| 0.4342        | 3.8186 | 1600 | 0.2267          | 0.1869 |
| 0.466         | 4.0573 | 1700 | 0.2284          | 0.1898 |
| 0.4268        | 4.2959 | 1800 | 0.2325          | 0.1958 |
| 0.4283        | 4.5346 | 1900 | 0.2250          | 0.1886 |
| 0.4407        | 4.7733 | 2000 | 0.2250          | 0.1884 |
| 0.4762        | 5.0119 | 2100 | 0.2277          | 0.1894 |
| 0.4289        | 5.2506 | 2200 | 0.2225          | 0.1872 |
| 0.4391        | 5.4893 | 2300 | 0.2229          | 0.1884 |
| 0.4333        | 5.7279 | 2400 | 0.2229          | 0.1878 |
| 0.4351        | 5.9666 | 2500 | 0.2279          | 0.1902 |
| 0.4065        | 6.2053 | 2600 | 0.2280          | 0.1939 |

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0