w2v-bert-2.0-nchlt_mdd

This model is a fine-tuned version of facebook/w2v-bert-2.0 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1305
  • WER: 0.1407
  • CER: 0.0252

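WER (word error rate) and CER (character error rate) are the standard edit-distance metrics for speech recognition. A minimal sketch of how they can be computed with the Hugging Face `evaluate` library; the reference and prediction transcripts below are hypothetical placeholders:

```python
import evaluate

# Load the WER and CER metrics from the evaluate hub (CER requires jiwer).
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical reference transcripts and model predictions.
references = ["the quick brown fox jumps over the lazy dog"]
predictions = ["the quick brown fox jumps over the lazy dog"]

wer = wer_metric.compute(predictions=predictions, references=references)
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```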
Model description

More information needed

Intended uses & limitations

More information needed
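Pending fuller documentation, here is a minimal inference sketch, assuming the checkpoint carries a CTC head for speech recognition (the model id is taken from this repository; the audio path is a hypothetical placeholder and the waveform should be 16 kHz mono):

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "aconeil/w2v-bert-2.0-nchlt_mdd"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# Load a 16 kHz mono waveform (hypothetical file).
speech, _ = librosa.load("sample.wav", sr=16000)

# Extract input features and run greedy CTC decoding.
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)

transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```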

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP

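For reference, a hedged sketch of `TrainingArguments` that mirrors the values listed above; `output_dir` is a placeholder, and `fp16=True` is an assumption standing in for "Native AMP", which the card does not further specify:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nchlt_mdd",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 32
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    fp16=True,                       # assumed mixed-precision setting ("Native AMP")
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```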
Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER    | CER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| 2.3911        | 0.2164 | 300   | 0.3147          | 0.3557 | 0.0617 |
| 0.3253        | 0.4327 | 600   | 0.2722          | 0.3179 | 0.0540 |
| 0.2547        | 0.6491 | 900   | 0.2391          | 0.3374 | 0.0517 |
| 0.2227        | 0.8655 | 1200  | 0.1984          | 0.2551 | 0.0436 |
| 0.1826        | 1.0815 | 1500  | 0.1704          | 0.2311 | 0.0377 |
| 0.1729        | 1.2979 | 1800  | 0.1768          | 0.2304 | 0.0410 |
| 0.1567        | 1.5142 | 2100  | 0.1554          | 0.2162 | 0.0355 |
| 0.1563        | 1.7306 | 2400  | 0.1515          | 0.2037 | 0.0347 |
| 0.1376        | 1.9470 | 2700  | 0.1525          | 0.2068 | 0.0362 |
| 0.1266        | 2.1630 | 3000  | 0.1433          | 0.1840 | 0.0334 |
| 0.119         | 2.3794 | 3300  | 0.1403          | 0.1901 | 0.0317 |
| 0.1148        | 2.5957 | 3600  | 0.1424          | 0.1753 | 0.0307 |
| 0.1192        | 2.8121 | 3900  | 0.1401          | 0.1800 | 0.0334 |
| 0.1051        | 3.0281 | 4200  | 0.1349          | 0.1744 | 0.0294 |
| 0.0941        | 3.2445 | 4500  | 0.1284          | 0.1732 | 0.0287 |
| 0.0887        | 3.4609 | 4800  | 0.1319          | 0.1624 | 0.0288 |
| 0.093         | 3.6772 | 5100  | 0.1322          | 0.1616 | 0.0286 |
| 0.0892        | 3.8936 | 5400  | 0.1309          | 0.1649 | 0.0282 |
| 0.0879        | 4.1096 | 5700  | 0.1318          | 0.1761 | 0.0296 |
| 0.0769        | 4.3260 | 6000  | 0.1219          | 0.1535 | 0.0268 |
| 0.0794        | 4.5424 | 6300  | 0.1214          | 0.1518 | 0.0267 |
| 0.0741        | 4.7587 | 6600  | 0.1192          | 0.1532 | 0.0267 |
| 0.0745        | 4.9751 | 6900  | 0.1210          | 0.1622 | 0.0278 |
| 0.0621        | 5.1911 | 7200  | 0.1205          | 0.1509 | 0.0265 |
| 0.0586        | 5.4075 | 7500  | 0.1197          | 0.1426 | 0.0259 |
| 0.0596        | 5.6239 | 7800  | 0.1177          | 0.1426 | 0.0250 |
| 0.0604        | 5.8402 | 8100  | 0.1224          | 0.1471 | 0.0262 |
| 0.0569        | 6.0563 | 8400  | 0.1241          | 0.1453 | 0.0254 |
| 0.0464        | 6.2726 | 8700  | 0.1560          | 0.1665 | 0.0342 |
| 0.0465        | 6.4890 | 9000  | 0.1279          | 0.1425 | 0.0253 |
| 0.047         | 6.7054 | 9300  | 0.1289          | 0.1426 | 0.0266 |
| 0.0501        | 6.9217 | 9600  | 0.1239          | 0.1413 | 0.0261 |
| 0.0415        | 7.1378 | 9900  | 0.1286          | 0.1413 | 0.0257 |
| 0.0377        | 7.3541 | 10200 | 0.1332          | 0.1387 | 0.0252 |
| 0.0358        | 7.5705 | 10500 | 0.1368          | 0.1421 | 0.0257 |
| 0.0399        | 7.7869 | 10800 | 0.1261          | 0.1453 | 0.0261 |
| 0.0407        | 8.0029 | 11100 | 0.1274          | 0.1324 | 0.0238 |
| 0.0278        | 8.2193 | 11400 | 0.1292          | 0.1385 | 0.0249 |
| 0.0353        | 8.4356 | 11700 | 0.1344          | 0.1342 | 0.0246 |
| 0.0318        | 8.6520 | 12000 | 0.1322          | 0.1420 | 0.0262 |
| 0.0319        | 8.8684 | 12300 | 0.1361          | 0.1416 | 0.0265 |
| 0.031         | 9.0844 | 12600 | 0.1353          | 0.1409 | 0.0257 |
| 0.0284        | 9.3008 | 12900 | 0.1307          | 0.1436 | 0.0260 |
| 0.0304        | 9.5171 | 13200 | 0.1345          | 0.1466 | 0.0265 |
| 0.0308        | 9.7335 | 13500 | 0.1327          | 0.1448 | 0.0259 |
| 0.0314        | 9.9499 | 13800 | 0.1305          | 0.1407 | 0.0252 |

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0