w2v-bert-2.0-nchlt_mdd

This model is a fine-tuned version of facebook/w2v-bert-2.0 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1305
  • WER: 0.1407
  • CER: 0.0252

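WER (word error rate) and CER (character error rate) are the standard edit-distance metrics for speech recognition. A minimal sketch of how they can be computed with the Hugging Face `evaluate` library; the reference and prediction transcripts below are hypothetical placeholders:

```python
import evaluate

# Load the WER and CER metrics from the evaluate hub (CER requires jiwer).
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical reference transcripts and model predictions.
references = ["the quick brown fox jumps over the lazy dog"]
predictions = ["the quick brown fox jumps over the lazy dog"]

wer = wer_metric.compute(predictions=predictions, references=references)
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```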
Model description

More information needed

Intended uses & limitations

More information needed
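Pending fuller documentation, here is a minimal inference sketch, assuming the checkpoint carries a CTC head for speech recognition (the model id is taken from this repository; the audio path is a hypothetical placeholder and the waveform should be 16 kHz mono):

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "aconeil/w2v-bert-2.0-nchlt_mdd"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# Load a 16 kHz mono waveform (hypothetical file).
speech, _ = librosa.load("sample.wav", sr=16000)

# Extract input features and run greedy CTC decoding.
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)

transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```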

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP

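For reference, a hedged sketch of `TrainingArguments` that mirrors the values listed above; `output_dir` is a placeholder, and `fp16=True` is an assumption standing in for "Native AMP", which the card does not further specify:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nchlt_mdd",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 32
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    fp16=True,                       # assumed mixed-precision setting ("Native AMP")
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```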
Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER    | CER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| 2.3911        | 0.2164 | 300   | 0.3147          | 0.3557 | 0.0617 |
| 0.3253        | 0.4327 | 600   | 0.2722          | 0.3179 | 0.0540 |
| 0.2547        | 0.6491 | 900   | 0.2391          | 0.3374 | 0.0517 |
| 0.2227        | 0.8655 | 1200  | 0.1984          | 0.2551 | 0.0436 |
| 0.1826        | 1.0815 | 1500  | 0.1704          | 0.2311 | 0.0377 |
| 0.1729        | 1.2979 | 1800  | 0.1768          | 0.2304 | 0.0410 |
| 0.1567        | 1.5142 | 2100  | 0.1554          | 0.2162 | 0.0355 |
| 0.1563        | 1.7306 | 2400  | 0.1515          | 0.2037 | 0.0347 |
| 0.1376        | 1.9470 | 2700  | 0.1525          | 0.2068 | 0.0362 |
| 0.1266        | 2.1630 | 3000  | 0.1433          | 0.1840 | 0.0334 |
| 0.119         | 2.3794 | 3300  | 0.1403          | 0.1901 | 0.0317 |
| 0.1148        | 2.5957 | 3600  | 0.1424          | 0.1753 | 0.0307 |
| 0.1192        | 2.8121 | 3900  | 0.1401          | 0.1800 | 0.0334 |
| 0.1051        | 3.0281 | 4200  | 0.1349          | 0.1744 | 0.0294 |
| 0.0941        | 3.2445 | 4500  | 0.1284          | 0.1732 | 0.0287 |
| 0.0887        | 3.4609 | 4800  | 0.1319          | 0.1624 | 0.0288 |
| 0.093         | 3.6772 | 5100  | 0.1322          | 0.1616 | 0.0286 |
| 0.0892        | 3.8936 | 5400  | 0.1309          | 0.1649 | 0.0282 |
| 0.0879        | 4.1096 | 5700  | 0.1318          | 0.1761 | 0.0296 |
| 0.0769        | 4.3260 | 6000  | 0.1219          | 0.1535 | 0.0268 |
| 0.0794        | 4.5424 | 6300  | 0.1214          | 0.1518 | 0.0267 |
| 0.0741        | 4.7587 | 6600  | 0.1192          | 0.1532 | 0.0267 |
| 0.0745        | 4.9751 | 6900  | 0.1210          | 0.1622 | 0.0278 |
| 0.0621        | 5.1911 | 7200  | 0.1205          | 0.1509 | 0.0265 |
| 0.0586        | 5.4075 | 7500  | 0.1197          | 0.1426 | 0.0259 |
| 0.0596        | 5.6239 | 7800  | 0.1177          | 0.1426 | 0.0250 |
| 0.0604        | 5.8402 | 8100  | 0.1224          | 0.1471 | 0.0262 |
| 0.0569        | 6.0563 | 8400  | 0.1241          | 0.1453 | 0.0254 |
| 0.0464        | 6.2726 | 8700  | 0.1560          | 0.1665 | 0.0342 |
| 0.0465        | 6.4890 | 9000  | 0.1279          | 0.1425 | 0.0253 |
| 0.047         | 6.7054 | 9300  | 0.1289          | 0.1426 | 0.0266 |
| 0.0501        | 6.9217 | 9600  | 0.1239          | 0.1413 | 0.0261 |
| 0.0415        | 7.1378 | 9900  | 0.1286          | 0.1413 | 0.0257 |
| 0.0377        | 7.3541 | 10200 | 0.1332          | 0.1387 | 0.0252 |
| 0.0358        | 7.5705 | 10500 | 0.1368          | 0.1421 | 0.0257 |
| 0.0399        | 7.7869 | 10800 | 0.1261          | 0.1453 | 0.0261 |
| 0.0407        | 8.0029 | 11100 | 0.1274          | 0.1324 | 0.0238 |
| 0.0278        | 8.2193 | 11400 | 0.1292          | 0.1385 | 0.0249 |
| 0.0353        | 8.4356 | 11700 | 0.1344          | 0.1342 | 0.0246 |
| 0.0318        | 8.6520 | 12000 | 0.1322          | 0.1420 | 0.0262 |
| 0.0319        | 8.8684 | 12300 | 0.1361          | 0.1416 | 0.0265 |
| 0.031         | 9.0844 | 12600 | 0.1353          | 0.1409 | 0.0257 |
| 0.0284        | 9.3008 | 12900 | 0.1307          | 0.1436 | 0.0260 |
| 0.0304        | 9.5171 | 13200 | 0.1345          | 0.1466 | 0.0265 |
| 0.0308        | 9.7335 | 13500 | 0.1327          | 0.1448 | 0.0259 |
| 0.0314        | 9.9499 | 13800 | 0.1305          | 0.1407 | 0.0252 |

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0