nhi_heldout-speaker-exp_JJG503_mms-1b-nhi-adapterft

This model is a fine-tuned version of facebook/mms-1b-all on the audiofolder dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the list):

  • Loss: 0.9551
  • WER: 0.5099
  • CER: 0.1636
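
The card itself does not include usage code; the following is a minimal inference sketch under the usual assumptions for MMS-style CTC checkpoints in Transformers. The `sample.wav` path is a placeholder, and the 16 kHz mono input requirement is inherited from the MMS base model.

```python
# Minimal inference sketch (not from the model card): load this checkpoint
# with the standard Transformers CTC classes used for MMS models.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "Lguyogiro/nhi_heldout-speaker-exp_JJG503_mms-1b-nhi-adapterft"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS expects 16 kHz mono audio; "sample.wav" is a placeholder path.
audio, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(pred_ids))
```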

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 100
  • mixed_precision_training: Native AMP
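
A hedged sketch of how these settings map onto `TrainingArguments` in Transformers 4.41; this is an assumed reconstruction, not the author's actual script, and `output_dir` is a placeholder:

```python
# Sketch mapping the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-nhi-adapterft",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,                     # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                  # epsilon=1e-08
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                          # Native AMP mixed precision
)
```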

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 0.9259 | 1.6807 | 200 | 1.0631 | 0.7246 | 0.2364 |
| 0.7325 | 3.3613 | 400 | 0.9573 | 0.6709 | 0.2154 |
| 0.6496 | 5.0420 | 600 | 0.9066 | 0.6591 | 0.2078 |
| 0.6303 | 6.7227 | 800 | 0.8995 | 0.6168 | 0.1959 |
| 0.576 | 8.4034 | 1000 | 0.8594 | 0.6016 | 0.1945 |
| 0.5455 | 10.0840 | 1200 | 0.7946 | 0.5847 | 0.1838 |
| 0.5304 | 11.7647 | 1400 | 0.8018 | 0.5879 | 0.1833 |
| 0.507 | 13.4454 | 1600 | 0.8205 | 0.5863 | 0.1883 |
| 0.4872 | 15.1261 | 1800 | 0.8448 | 0.5805 | 0.1846 |
| 0.4867 | 16.8067 | 2000 | 0.8381 | 0.5782 | 0.1834 |
| 0.4449 | 18.4874 | 2200 | 0.7953 | 0.5819 | 0.1827 |
| 0.4197 | 20.1681 | 2400 | 0.7872 | 0.5683 | 0.1796 |
| 0.4286 | 21.8487 | 2600 | 0.7965 | 0.5479 | 0.1729 |
| 0.4008 | 23.5294 | 2800 | 0.7981 | 0.5492 | 0.1729 |
| 0.4076 | 25.2101 | 3000 | 0.7909 | 0.5505 | 0.1726 |
| 0.3888 | 26.8908 | 3200 | 0.7650 | 0.5581 | 0.1754 |
| 0.3583 | 28.5714 | 3400 | 0.7871 | 0.5387 | 0.1702 |
| 0.3583 | 30.2521 | 3600 | 0.8008 | 0.5582 | 0.1722 |
| 0.3613 | 31.9328 | 3800 | 0.8101 | 0.5522 | 0.1720 |
| 0.3337 | 33.6134 | 4000 | 0.7855 | 0.5392 | 0.1667 |
| 0.3377 | 35.2941 | 4200 | 0.8145 | 0.5377 | 0.1656 |
| 0.3176 | 36.9748 | 4400 | 0.8048 | 0.5357 | 0.1679 |
| 0.2971 | 38.6555 | 4600 | 0.8438 | 0.5390 | 0.1713 |
| 0.3156 | 40.3361 | 4800 | 0.8106 | 0.5308 | 0.1688 |
| 0.311 | 42.0168 | 5000 | 0.8293 | 0.5310 | 0.1699 |
| 0.2884 | 43.6975 | 5200 | 0.8418 | 0.5367 | 0.1709 |
| 0.2898 | 45.3782 | 5400 | 0.8149 | 0.5399 | 0.1715 |
| 0.271 | 47.0588 | 5600 | 0.8387 | 0.5292 | 0.1650 |
| 0.276 | 48.7395 | 5800 | 0.8732 | 0.5345 | 0.1677 |
| 0.2625 | 50.4202 | 6000 | 0.8321 | 0.5310 | 0.1667 |
| 0.2632 | 52.1008 | 6200 | 0.8382 | 0.5252 | 0.1645 |
| 0.2462 | 53.7815 | 6400 | 0.8292 | 0.5270 | 0.1666 |
| 0.249 | 55.4622 | 6600 | 0.8642 | 0.5308 | 0.1682 |
| 0.2489 | 57.1429 | 6800 | 0.9214 | 0.5278 | 0.1692 |
| 0.2445 | 58.8235 | 7000 | 0.8832 | 0.5326 | 0.1679 |
| 0.2391 | 60.5042 | 7200 | 0.8951 | 0.5199 | 0.1678 |
| 0.2294 | 62.1849 | 7400 | 0.8613 | 0.5209 | 0.1649 |
| 0.2242 | 63.8655 | 7600 | 0.8602 | 0.5178 | 0.1650 |
| 0.2271 | 65.5462 | 7800 | 0.8963 | 0.5224 | 0.1690 |
| 0.217 | 67.2269 | 8000 | 0.8601 | 0.5171 | 0.1648 |
| 0.2099 | 68.9076 | 8200 | 0.8603 | 0.5088 | 0.1640 |
| 0.2097 | 70.5882 | 8400 | 0.8710 | 0.5166 | 0.1641 |
| 0.2075 | 72.2689 | 8600 | 0.8921 | 0.5190 | 0.1637 |
| 0.1994 | 73.9496 | 8800 | 0.8738 | 0.5070 | 0.1620 |
| 0.1962 | 75.6303 | 9000 | 0.8713 | 0.5109 | 0.1629 |
| 0.194 | 77.3109 | 9200 | 0.8724 | 0.5187 | 0.1634 |
| 0.1864 | 78.9916 | 9400 | 0.9267 | 0.5227 | 0.1648 |
| 0.187 | 80.6723 | 9600 | 0.9252 | 0.5146 | 0.1649 |
| 0.1799 | 82.3529 | 9800 | 0.9085 | 0.5152 | 0.1642 |
| 0.1868 | 84.0336 | 10000 | 0.9019 | 0.5139 | 0.1623 |
| 0.1694 | 85.7143 | 10200 | 0.9344 | 0.5174 | 0.1646 |
| 0.1754 | 87.3950 | 10400 | 0.9643 | 0.5121 | 0.1636 |
| 0.1736 | 89.0756 | 10600 | 0.9524 | 0.5130 | 0.1645 |
| 0.1652 | 90.7563 | 10800 | 0.9473 | 0.5138 | 0.1649 |
| 0.1789 | 92.4370 | 11000 | 0.9439 | 0.5107 | 0.1635 |
| 0.1659 | 94.1176 | 11200 | 0.9515 | 0.5146 | 0.1645 |
| 0.1683 | 95.7983 | 11400 | 0.9558 | 0.5119 | 0.1631 |
| 0.163 | 97.4790 | 11600 | 0.9587 | 0.5119 | 0.1637 |
| 0.1596 | 99.1597 | 11800 | 0.9551 | 0.5099 | 0.1636 |
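
The WER and CER columns above can be computed with the `evaluate` library (a hedged sketch of the assumed workflow, not code from this card; both metrics require the `jiwer` package, and the strings below are placeholders):

```python
# Sketch of computing the WER/CER metrics reported in the table; the strings
# are placeholders, not data from this model's evaluation set.
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")

predictions = ["an example hypothesis"]  # hypothetical model outputs
references = ["an example reference"]    # hypothetical gold transcripts

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```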

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.4.0
  • Datasets 3.2.0
  • Tokenizers 0.19.1
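
To reproduce results exactly, it can help to confirm that the installed versions match this list; a trivial check:

```python
# Sanity check: compare installed versions against the ones listed above.
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected 4.41.2
print(torch.__version__)         # expected 2.4.0
print(datasets.__version__)      # expected 3.2.0
print(tokenizers.__version__)    # expected 0.19.1
```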