mms-1b-lozgen-balanced-model

This model is a fine-tuned version of facebook/mms-1b-all on the LOZGEN - TOI dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5419
  • WER: 0.3662
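
To sanity-check the model against these numbers, the snippet below shows one way to transcribe a 16 kHz audio file with this checkpoint. It is a minimal sketch assuming the checkpoint loads as a standard Wav2Vec2ForCTC model, as MMS fine-tunes typically do; the file path sample.wav is a placeholder.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Fine-tuned checkpoint (repo id from the model card).
model_id = "csikasote/mms-1b-lozgen-balanced-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS models expect 16 kHz mono audio; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```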

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
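
As a rough illustration, these settings map onto Hugging Face TrainingArguments as sketched below. The output_dir and the evaluation cadence are assumptions (the results table suggests evaluation every 100 steps) and are not stated explicitly in the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-lozgen-balanced-model",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                # "Native AMP" mixed precision
    eval_strategy="steps",    # assumed from the 100-step eval cadence
    eval_steps=100,           # assumed
)
```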

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER    |
|--------------:|--------:|-----:|----------------:|-------:|
| 6.6518        | 0.8130  | 100  | 3.1671          | 0.9943 |
| 2.6718        | 1.6260  | 200  | 2.2609          | 0.9316 |
| 1.4567        | 2.4390  | 300  | 0.7404          | 0.7192 |
| 0.7044        | 3.2520  | 400  | 0.6402          | 0.5321 |
| 0.6221        | 4.0650  | 500  | 0.6065          | 0.5166 |
| 0.6016        | 4.8780  | 600  | 0.5948          | 0.4787 |
| 0.5686        | 5.6911  | 700  | 0.5806          | 0.4641 |
| 0.6054        | 6.5041  | 800  | 0.5716          | 0.4486 |
| 0.4871        | 7.3171  | 900  | 0.5732          | 0.4446 |
| 0.5275        | 8.1301  | 1000 | 0.5667          | 0.4350 |
| 0.5199        | 8.9431  | 1100 | 0.5688          | 0.4300 |
| 0.5031        | 9.7561  | 1200 | 0.5516          | 0.4443 |
| 0.4533        | 10.5691 | 1300 | 0.5577          | 0.4179 |
| 0.4738        | 11.3821 | 1400 | 0.5536          | 0.4057 |
| 0.4925        | 12.1951 | 1500 | 0.5503          | 0.3969 |
| 0.441         | 13.0081 | 1600 | 0.5403          | 0.4005 |
| 0.4177        | 13.8211 | 1700 | 0.5563          | 0.3914 |
| 0.4589        | 14.6341 | 1800 | 0.5394          | 0.3876 |
| 0.4131        | 15.4472 | 1900 | 0.5425          | 0.3957 |
| 0.393         | 16.2602 | 2000 | 0.5469          | 0.3907 |
| 0.4235        | 17.0732 | 2100 | 0.5357          | 0.3878 |
| 0.4113        | 17.8862 | 2200 | 0.5391          | 0.3802 |
| 0.3781        | 18.6992 | 2300 | 0.5324          | 0.3728 |
| 0.3706        | 19.5122 | 2400 | 0.5463          | 0.3826 |
| 0.3617        | 20.3252 | 2500 | 0.5391          | 0.3697 |
| 0.401         | 21.1382 | 2600 | 0.5417          | 0.3664 |
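
The WER column above is the word error rate on the validation set. A minimal sketch for reproducing the metric on your own transcriptions with the evaluate library follows; the reference/prediction strings are hypothetical placeholders, used only to show the call signature.

```python
import evaluate

# WER = (substitutions + insertions + deletions) / words in the reference.
wer_metric = evaluate.load("wer")

# Hypothetical placeholder pairs.
references = ["the cat sat on the mat"]
predictions = ["the cat sit on mat"]

print(wer_metric.compute(references=references, predictions=predictions))
```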

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size

  • 965M parameters (Safetensors, F32)
