mms-1b-toigen-balanced-model

This model is a fine-tuned version of facebook/mms-1b-all on the TOIGEN - TOI dataset. Its evaluation-set loss and word error rate (WER) at each logged checkpoint are listed in the training results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 2500.0
mixed_precision_training: Native AMP
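
The relationship between some of the values above can be made concrete with a short sketch. It shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear schedule with 100 warmup steps would shape the learning rate. It is an illustration, not the training code used for this model: a single device is assumed (which is consistent with 4 x 2 = 8), and TOTAL_STEPS is a hypothetical value because the card does not state the total number of optimizer steps.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 100


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# ASSUMPTION: the total step count is not given in the card;
# 2500 is used purely for illustration.
TOTAL_STEPS = 2500

print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 8

# Learning rate during warmup, at the peak, mid-decay, and at the end.
for s in (0, 50, 100, 1300, 2500):
    print(s, linear_lr(s, TOTAL_STEPS))
```

With these settings the learning rate peaks at 0.0003 at step 100 and then decays linearly; how quickly it decays depends entirely on the assumed total step count.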

Training Loss	Epoch	Step	Validation Loss	WER
14.2297	0.8850	100	3.4836	1.0056
4.1389	1.7699	200	0.5562	0.5694
1.3643	2.6549	300	0.4360	0.4958
1.1715	3.5398	400	0.3980	0.4824
1.1309	4.4248	500	0.3785	0.4583
1.0283	5.3097	600	0.3741	0.4477
1.0148	6.1947	700	0.3669	0.4403
0.9961	7.0796	800	0.3607	0.4356
0.9248	7.9646	900	0.3581	0.4236
0.9482	8.8496	1000	0.3463	0.4356
0.8815	9.7345	1100	0.3488	0.4273
0.8209	10.6195	1200	0.3384	0.4000
0.8754	11.5044	1300	0.3459	0.4051
0.8454	12.3894	1400	0.3317	0.3884
0.8164	13.2743	1500	0.3319	0.4032
0.7673	14.1593	1600	0.3311	0.3921
0.7953	15.0442	1700	0.3333	0.3944
0.7527	15.9292	1800	0.3313	0.3917
0.7630	16.8142	1900	0.3278	0.3931
0.7319	17.6991	2000	0.3234	0.3755
0.7352	18.5841	2100	0.3248	0.3806
0.7017	19.4690	2200	0.3334	0.3852
0.6902	20.3540	2300	0.3304	0.3889
0.7070	21.2389	2400	0.3314	0.3856
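
To help read the table above, the following sketch does two things. First, it implements the usual definition of word error rate (word-level edit distance divided by the number of reference words), which shows why a value above 1, such as the 1.0056 logged at step 100, is possible when the hypothesis contains many extra words. Second, it parses the logged rows and picks the checkpoint with the lowest WER. The function names and the example sentences are hypothetical, and this is not necessarily the metric implementation used to produce the table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Two substitutions plus one insertion against a two-word reference.
print(word_error_rate("a b", "x y z"))  # 1.5

# Rows copied from the table: training loss, epoch, step, validation loss, WER.
TABLE = """
14.2297 0.8850 100 3.4836 1.0056
4.1389 1.7699 200 0.5562 0.5694
1.3643 2.6549 300 0.4360 0.4958
1.1715 3.5398 400 0.3980 0.4824
1.1309 4.4248 500 0.3785 0.4583
1.0283 5.3097 600 0.3741 0.4477
1.0148 6.1947 700 0.3669 0.4403
0.9961 7.0796 800 0.3607 0.4356
0.9248 7.9646 900 0.3581 0.4236
0.9482 8.8496 1000 0.3463 0.4356
0.8815 9.7345 1100 0.3488 0.4273
0.8209 10.6195 1200 0.3384 0.4
0.8754 11.5044 1300 0.3459 0.4051
0.8454 12.3894 1400 0.3317 0.3884
0.8164 13.2743 1500 0.3319 0.4032
0.7673 14.1593 1600 0.3311 0.3921
0.7953 15.0442 1700 0.3333 0.3944
0.7527 15.9292 1800 0.3313 0.3917
0.763 16.8142 1900 0.3278 0.3931
0.7319 17.6991 2000 0.3234 0.3755
0.7352 18.5841 2100 0.3248 0.3806
0.7017 19.4690 2200 0.3334 0.3852
0.6902 20.3540 2300 0.3304 0.3889
0.707 21.2389 2400 0.3314 0.3856
"""

rows = []
for line in TABLE.strip().splitlines():
    train_loss, epoch, step, val_loss, wer = line.split()
    rows.append({"step": int(step), "epoch": float(epoch),
                 "val_loss": float(val_loss), "wer": float(wer)})

# Checkpoint with the lowest logged WER.
best = min(rows, key=lambda r: r["wer"])
print(best["step"], best["val_loss"], best["wer"])  # 2000 0.3234 0.3755
```

In the logged rows, step 2000 has both the lowest WER and the lowest validation loss, and the later checkpoints do not improve on it.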