mms-1b-lozgen-combined-model

This model is a fine-tuned version of facebook/mms-1b-all on the LOZGEN - LOZ dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4288
  • Wer: 0.3297
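The card does not include a usage snippet, so below is a minimal inference sketch. It assumes the standard CTC interface that MMS checkpoints expose in transformers (AutoProcessor plus Wav2Vec2ForCTC); the file example.wav is a hypothetical 16 kHz recording, not part of this card.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-lozgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS checkpoints expect 16 kHz mono audio.
# "example.wav" is a placeholder path for illustration.
speech, _ = librosa.load("example.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the argmax per frame, then collapse
# repeats and blanks inside processor.decode.
ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(ids))
```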

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
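For reference, the hyperparameters above map onto the transformers TrainingArguments API roughly as sketched below. This is a hypothetical reconstruction for readability; the actual training script used for this checkpoint is not included in the card.

```python
from transformers import TrainingArguments

# Assumed mapping of the listed hyperparameters; not the author's script.
training_args = TrainingArguments(
    output_dir="mms-1b-lozgen-combined-model",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",        # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                  # "Native AMP" mixed-precision training
)
```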

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 6.5686        | 0.4065  | 100  | 3.0827          | 0.9701 |
| 2.6223        | 0.8130  | 200  | 2.2379          | 0.9112 |
| 1.4386        | 1.2195  | 300  | 0.6910          | 0.7809 |
| 0.8073        | 1.6260  | 400  | 0.5903          | 0.5699 |
| 0.651         | 2.0325  | 500  | 0.5555          | 0.5037 |
| 0.655         | 2.4390  | 600  | 0.5298          | 0.4818 |
| 0.6579        | 2.8455  | 700  | 0.5298          | 0.4603 |
| 0.5699        | 3.2520  | 800  | 0.5160          | 0.4284 |
| 0.6104        | 3.6585  | 900  | 0.5070          | 0.4320 |
| 0.604         | 4.0650  | 1000 | 0.4978          | 0.4098 |
| 0.5681        | 4.4715  | 1100 | 0.4975          | 0.4072 |
| 0.5493        | 4.8780  | 1200 | 0.4878          | 0.4038 |
| 0.581         | 5.2846  | 1300 | 0.4826          | 0.3965 |
| 0.5746        | 5.6911  | 1400 | 0.4793          | 0.4242 |
| 0.5238        | 6.0976  | 1500 | 0.4724          | 0.3833 |
| 0.5204        | 6.5041  | 1600 | 0.4866          | 0.3864 |
| 0.5563        | 6.9106  | 1700 | 0.4672          | 0.3839 |
| 0.5121        | 7.3171  | 1800 | 0.4664          | 0.3719 |
| 0.4774        | 7.7236  | 1900 | 0.4625          | 0.3652 |
| 0.5356        | 8.1301  | 2000 | 0.4721          | 0.3693 |
| 0.4385        | 8.5366  | 2100 | 0.4560          | 0.3695 |
| 0.5561        | 8.9431  | 2200 | 0.4453          | 0.3594 |
| 0.414         | 9.3496  | 2300 | 0.4489          | 0.3546 |
| 0.4763        | 9.7561  | 2400 | 0.4525          | 0.3521 |
| 0.5317        | 10.1626 | 2500 | 0.4424          | 0.3557 |
| 0.4939        | 10.5691 | 2600 | 0.4398          | 0.3502 |
| 0.4456        | 10.9756 | 2700 | 0.4415          | 0.3467 |
| 0.4583        | 11.3821 | 2800 | 0.4502          | 0.3446 |
| 0.4573        | 11.7886 | 2900 | 0.4267          | 0.3403 |
| 0.398         | 12.1951 | 3000 | 0.4305          | 0.3406 |
| 0.472         | 12.6016 | 3100 | 0.4268          | 0.3320 |
| 0.3993        | 13.0081 | 3200 | 0.4288          | 0.3297 |
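The Wer column is the word error rate on the validation set, where lower is better. It can be computed with the evaluate library's wer metric, as in the sketch below; the strings are placeholders for illustration, not actual LOZGEN transcripts.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder hypothesis/reference pairs; the real metric was computed
# on the LOZGEN evaluation set.
predictions = ["the model transcribed this sentence"]
references = ["the model transcribed this sentence correctly"]

# WER = (substitutions + insertions + deletions) / reference word count
print(wer_metric.compute(predictions=predictions, references=references))
```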

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0