mms-300m-sakha

This model is a fine-tuned version of facebook/mms-300m on the common_voice_13_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.3105
WER: 0.3059
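
For readers unfamiliar with the metric, the word error rate reported above is conventionally the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation code used for this model; the function name `word_error_rate` and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(example, 4))
```

Under this definition an empty hypothesis scores exactly 1.0, and a hypothesis with many extra words can score above 1.0. Evaluation scripts typically aggregate errors over the whole evaluation set rather than averaging per-utterance scores, so a corpus-level figure may differ from a simple mean of values computed this way.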

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 100
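
To make the interaction between `learning_rate`, `lr_scheduler_type: linear`, and `lr_scheduler_warmup_steps` concrete, here is a minimal sketch of a linear schedule with warmup: the rate climbs linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the last step. The total of 11100 steps is taken from the final row of the training results table; the function name `linear_schedule_lr` is a hypothetical helper, and the sketch assumes the schedule is stepped once per optimizer update.

```python
PEAK_LR = 2e-4       # learning_rate from the list above
WARMUP_STEPS = 1000  # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 11100  # assumption: final step in the training results table


def linear_schedule_lr(step: int,
                       peak_lr: float = PEAK_LR,
                       warmup_steps: int = WARMUP_STEPS,
                       total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from the peak rate down to 0.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 500, 1000, 6050, 11100):
    print(step, linear_schedule_lr(step))
```

With these values the warmup covers roughly the first 9 of the 100 epochs (1000 of 11100 steps), so the peak rate of 0.0002 is reached only around epoch 9.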

Training results

Training Loss	Epoch	Step	Validation Loss	WER
8.3074	1.0	111	4.1552	1.0
3.6543	2.0	222	3.2635	1.0
3.109	3.0	333	2.9604	1.0
2.221	4.0	444	0.9272	0.7549
0.6842	5.0	555	0.4823	0.5726
0.4123	6.0	666	0.3828	0.5006
0.3021	7.0	777	0.3563	0.4868
0.2589	8.0	888	0.3188	0.4482
0.2246	9.0	999	0.3108	0.4430
0.1896	10.0	1110	0.3100	0.4130
0.1695	11.0	1221	0.2926	0.4104
0.1528	12.0	1332	0.2906	0.4133
0.1385	13.0	1443	0.2815	0.3931
0.1267	14.0	1554	0.3070	0.3966
0.1194	15.0	1665	0.2917	0.3877
0.1102	16.0	1776	0.2896	0.3805
0.1056	17.0	1887	0.2768	0.3793
0.099	18.0	1998	0.2910	0.3782
0.0897	19.0	2109	0.3145	0.3793
0.0876	20.0	2220	0.3028	0.3710
0.0878	21.0	2331	0.2956	0.3744
0.0877	22.0	2442	0.2894	0.3730
0.0851	23.0	2553	0.3086	0.3805
0.0825	24.0	2664	0.3168	0.3744
0.0765	25.0	2775	0.3113	0.3615
0.0778	26.0	2886	0.3204	0.3744
0.0777	27.0	2997	0.3257	0.3727
0.0752	28.0	3108	0.3118	0.3612
0.0736	29.0	3219	0.3159	0.3638
0.0677	30.0	3330	0.2975	0.3540
0.0663	31.0	3441	0.3080	0.3548
0.0655	32.0	3552	0.3223	0.3597
0.0658	33.0	3663	0.3215	0.3571
0.0664	34.0	3774	0.3164	0.3733
0.0635	35.0	3885	0.3239	0.3586
0.0621	36.0	3996	0.3188	0.3586
0.06	37.0	4107	0.2937	0.3563
0.0572	38.0	4218	0.3262	0.3620
0.0576	39.0	4329	0.3097	0.3505
0.0571	40.0	4440	0.3086	0.3580
0.0559	41.0	4551	0.3257	0.3641
0.0581	42.0	4662	0.3245	0.3537
0.0542	43.0	4773	0.3193	0.3612
0.0516	44.0	4884	0.2950	0.3531
0.0553	45.0	4995	0.3261	0.3522
0.0508	46.0	5106	0.3347	0.3563
0.0478	47.0	5217	0.3229	0.3600
0.0468	48.0	5328	0.3134	0.3482
0.0478	49.0	5439	0.3087	0.3491
0.045	50.0	5550	0.3103	0.3361
0.0485	51.0	5661	0.3148	0.3476
0.0438	52.0	5772	0.3138	0.3448
0.0444	53.0	5883	0.3151	0.3407
0.0447	54.0	5994	0.2992	0.3355
0.0439	55.0	6105	0.3165	0.3436
0.0413	56.0	6216	0.3184	0.3384
0.0394	57.0	6327	0.3217	0.3404
0.0413	58.0	6438	0.3062	0.3315
0.0386	59.0	6549	0.2985	0.3255
0.039	60.0	6660	0.3125	0.3407
0.038	61.0	6771	0.2937	0.3381
0.0361	62.0	6882	0.3138	0.3318
0.0359	63.0	6993	0.3296	0.3315
0.0347	64.0	7104	0.3260	0.3355
0.036	65.0	7215	0.3003	0.3373
0.0366	66.0	7326	0.2967	0.3283
0.0321	67.0	7437	0.3035	0.3240
0.0308	68.0	7548	0.3335	0.3390
0.0311	69.0	7659	0.3096	0.3263
0.0325	70.0	7770	0.3164	0.3306
0.032	71.0	7881	0.2890	0.3211
0.0312	72.0	7992	0.2847	0.3194
0.0289	73.0	8103	0.2904	0.3200
0.0289	74.0	8214	0.2932	0.3174
0.0276	75.0	8325	0.2921	0.3168
0.0277	76.0	8436	0.3054	0.3200
0.0271	77.0	8547	0.3078	0.3197
0.0261	78.0	8658	0.3191	0.3220
0.0268	79.0	8769	0.3081	0.3211
0.0251	80.0	8880	0.3089	0.3142
0.0245	81.0	8991	0.3081	0.3151
0.0229	82.0	9102	0.3124	0.3148
0.0232	83.0	9213	0.3074	0.3142
0.0241	84.0	9324	0.3045	0.3111
0.0213	85.0	9435	0.3234	0.3131
0.0215	86.0	9546	0.3148	0.3105
0.0209	87.0	9657	0.3160	0.3134
0.0208	88.0	9768	0.3055	0.3099
0.0201	89.0	9879	0.2996	0.3065
0.0196	90.0	9990	0.3036	0.3073
0.0187	91.0	10101	0.3137	0.3111
0.0189	92.0	10212	0.3089	0.3067
0.0184	93.0	10323	0.3118	0.3113
0.0172	94.0	10434	0.3081	0.3105
0.018	95.0	10545	0.3108	0.3099
0.0164	96.0	10656	0.3081	0.3073
0.0175	97.0	10767	0.3100	0.3082
0.0159	98.0	10878	0.3124	0.3056
0.0181	99.0	10989	0.3093	0.3044
0.0161	100.0	11100	0.3105	0.3059
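
Because validation loss and WER do not reach their minimum at the same epoch, it can help to scan the log for the best row under each criterion. The sketch below parses a handful of rows copied from the table above (epochs 17, 72, 98, 99, and 100) and picks the best by each metric; the `EvalRow` type and `parse_rows` helper are illustrative assumptions, not part of the training code.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: int
    step: int
    val_loss: float
    wer: float


# A few tab-separated rows copied from the training results table.
ROWS_TSV = (
    "0.1056\t17.0\t1887\t0.2768\t0.3793\n"
    "0.0312\t72.0\t7992\t0.2847\t0.3194\n"
    "0.0159\t98.0\t10878\t0.3124\t0.3056\n"
    "0.0181\t99.0\t10989\t0.3093\t0.3044\n"
    "0.0161\t100.0\t11100\t0.3105\t0.3059\n"
)


def parse_rows(text: str) -> List[EvalRow]:
    """Parse tab-separated log rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split("\t")
        rows.append(EvalRow(float(train_loss), int(float(epoch)), int(step),
                            float(val_loss), float(wer)))
    return rows


rows = parse_rows(ROWS_TSV)
best_by_wer = min(rows, key=lambda r: r.wer)
best_by_val_loss = min(rows, key=lambda r: r.val_loss)
print("lowest WER:", best_by_wer.epoch, best_by_wer.wer)
print("lowest validation loss:", best_by_val_loss.epoch, best_by_val_loss.val_loss)
```

Among these rows the lowest WER is 0.3044 at epoch 99 and the lowest validation loss is 0.2768 at epoch 17, while the headline figures at the top of the card (loss 0.3105, WER 0.3059) match the epoch 100 row.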

Framework versions

Transformers 4.32.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
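
When trying to reproduce the reported numbers, it may be useful to check a local environment against the versions listed above. This is a small sketch using only the standard library; the distribution names (`torch` for PyTorch, and so on) are assumptions about how the packages were installed, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.32.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Return {package: (wanted, installed_or_None, matches)} for each entry."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(EXPECTED).items():
    status = "ok" if matches else "differs"
    print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment the card reports.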

Model repository: volodya-leveryev/mms-300m-sakha