csikasote/mms-1b-lozgen-combined-model

+---
+library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+model-index:
+- name: mms-1b-lozgen-combined-model
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mms-1b-lozgen-combined-model
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4288
+- Wer: 0.3297
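The `Wer` figure is a word error rate. The sketch below shows how such a score is usually computed, assuming the standard definition: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample strings are hypothetical, and this card does not state which metric implementation the Trainer actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over four words.
score = word_error_rate("a b c d", "a x c")
```

Under this definition, a score of 0.3297 corresponds to roughly 33 word errors per 100 reference words.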
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0003
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 42
+- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 30.0
+- mixed_precision_training: Native AMP
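The list above can be read as keyword arguments for `transformers.TrainingArguments`. The sketch below shows one plausible mapping, not the author's actual training script. It assumes a single device, so that `train_batch_size` corresponds to `per_device_train_batch_size`, and it assumes "Native AMP" means `fp16`, although it could also mean `bf16`. The helper `linear_lr` is a hypothetical illustration of a linear schedule with warmup, and `total_steps` must be supplied by the caller because the card does not list it.

```python
# Hyperparameters copied from the list above. Key names follow
# transformers.TrainingArguments; the mapping itself is an assumption.
training_kwargs = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 4,  # assumes a single device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 30.0,
    "fp16": True,  # assumption: "Native AMP" could also mean bf16
}


def linear_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    base_lr = training_kwargs["learning_rate"]
    warmup = training_kwargs["warmup_steps"]
    if step < warmup:
        return base_lr * step / warmup
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)
```

With these values the rate climbs from 0 to 0.0003 over the first 100 steps and then falls linearly, reaching 0 at `total_steps`.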
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Wer    |
+|:-------------:|:-------:|:----:|:---------------:|:------:|
+| 6.5686        | 0.4065  | 100  | 3.0827          | 0.9701 |
+| 2.6223        | 0.8130  | 200  | 2.2379          | 0.9112 |
+| 1.4386        | 1.2195  | 300  | 0.6910          | 0.7809 |
+| 0.8073        | 1.6260  | 400  | 0.5903          | 0.5699 |
+| 0.651         | 2.0325  | 500  | 0.5555          | 0.5037 |
+| 0.655         | 2.4390  | 600  | 0.5298          | 0.4818 |
+| 0.6579        | 2.8455  | 700  | 0.5298          | 0.4603 |
+| 0.5699        | 3.2520  | 800  | 0.5160          | 0.4284 |
+| 0.6104        | 3.6585  | 900  | 0.5070          | 0.4320 |
+| 0.604         | 4.0650  | 1000 | 0.4978          | 0.4098 |
+| 0.5681        | 4.4715  | 1100 | 0.4975          | 0.4072 |
+| 0.5493        | 4.8780  | 1200 | 0.4878          | 0.4038 |
+| 0.581         | 5.2846  | 1300 | 0.4826          | 0.3965 |
+| 0.5746        | 5.6911  | 1400 | 0.4793          | 0.4242 |
+| 0.5238        | 6.0976  | 1500 | 0.4724          | 0.3833 |
+| 0.5204        | 6.5041  | 1600 | 0.4866          | 0.3864 |
+| 0.5563        | 6.9106  | 1700 | 0.4672          | 0.3839 |
+| 0.5121        | 7.3171  | 1800 | 0.4664          | 0.3719 |
+| 0.4774        | 7.7236  | 1900 | 0.4625          | 0.3652 |
+| 0.5356        | 8.1301  | 2000 | 0.4721          | 0.3693 |
+| 0.4385        | 8.5366  | 2100 | 0.4560          | 0.3695 |
+| 0.5561        | 8.9431  | 2200 | 0.4453          | 0.3594 |
+| 0.414         | 9.3496  | 2300 | 0.4489          | 0.3546 |
+| 0.4763        | 9.7561  | 2400 | 0.4525          | 0.3521 |
+| 0.5317        | 10.1626 | 2500 | 0.4424          | 0.3557 |
+| 0.4939        | 10.5691 | 2600 | 0.4398          | 0.3502 |
+| 0.4456        | 10.9756 | 2700 | 0.4415          | 0.3467 |
+| 0.4583        | 11.3821 | 2800 | 0.4502          | 0.3446 |
+| 0.4573        | 11.7886 | 2900 | 0.4267          | 0.3403 |
+| 0.398         | 12.1951 | 3000 | 0.4305          | 0.3406 |
+| 0.472         | 12.6016 | 3100 | 0.4268          | 0.3320 |
+| 0.3993        | 13.0081 | 3200 | 0.4288          | 0.3297 |
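The `Epoch` and `Step` columns also give a rough picture of the run's size. The arithmetic below is a sketch that assumes a single device and no gradient accumulation, neither of which the card states. All variable names are illustrative.

```python
# (epoch, step) pairs copied from the first and last rows of the table above.
first_epoch, first_step = 0.4065, 100
last_epoch, last_step = 13.0081, 3200

# Optimizer steps per epoch, estimated from the last logged row.
steps_per_epoch = round(last_step / last_epoch)

# The first row should imply the same value.
assert round(first_step / first_epoch) == steps_per_epoch

# With train_batch_size = 4 this bounds the training set size from above
# (the final batch of each epoch may be partial).
train_batch_size = 4
approx_train_examples = steps_per_epoch * train_batch_size

# Steps a full num_epochs = 30 run would take, and the share that was logged.
planned_steps = steps_per_epoch * 30
logged_fraction = last_step / planned_steps

print(steps_per_epoch, approx_train_examples, planned_steps, round(logged_fraction, 3))
```

Under these assumptions an epoch is about 246 steps, or at most about 984 training examples, and a 30-epoch run would take about 7380 steps. The table stops at step 3200, roughly 43% of that schedule. The card does not say why. Within the logged range, the lowest validation loss (0.4267) is at step 2900 and the lowest WER (0.3297) is at step 3200.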
+### Framework versions
+- Transformers 4.48.0.dev0
+- Pytorch 2.5.1+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0