w2v-bert-2.0-nchlt

This model is a fine-tuned version of facebook/w2v-bert-2.0 on an unknown dataset. It achieves the following results on the evaluation set (a short inference sketch follows the results):

  • Loss: 0.1815
  • WER: 0.1258
  • CER: 0.0237

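Since the card does not yet document usage, the following is a minimal transcription sketch, not an official example. It assumes the checkpoint is published as aconeil/w2v-bert-2.0-nchlt with a CTC head and a paired processor, and that sample.wav is a placeholder for a 16 kHz mono recording.

```python
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

model_id = "aconeil/w2v-bert-2.0-nchlt"  # repo id taken from this card
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)
model.eval()

# "sample.wav" is a placeholder file; the model expects 16 kHz mono audio.
speech, sr = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```
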
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP

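For reference, here is a sketch of how these values would map onto transformers.TrainingArguments in a typical fine-tuning script. The output_dir and any settings not listed above (logging, evaluation cadence, saving) are assumptions and may differ from the original run.

```python
from transformers import TrainingArguments

# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-nchlt",    # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,      # effective train batch size: 16 * 2 = 32
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=500,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                          # Native AMP mixed-precision training
)
```
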
Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER    | CER    |
|---------------|--------|-------|-----------------|--------|--------|
| 2.2833        | 0.2293 | 300   | 0.3303          | 0.3651 | 0.0594 |
| 0.2781        | 0.4585 | 600   | 0.2465          | 0.3157 | 0.0516 |
| 0.2232        | 0.6878 | 900   | 0.2155          | 0.2855 | 0.0461 |
| 0.1987        | 0.9171 | 1200  | 0.1888          | 0.2507 | 0.0418 |
| 0.175         | 1.1460 | 1500  | 0.1845          | 0.2291 | 0.0403 |
| 0.1573        | 1.3752 | 1800  | 0.1599          | 0.2058 | 0.0378 |
| 0.1478        | 1.6045 | 2100  | 0.1527          | 0.1901 | 0.0319 |
| 0.1395        | 1.8338 | 2400  | 0.1483          | 0.1912 | 0.0319 |
| 0.137         | 2.0627 | 2700  | 0.1446          | 0.1740 | 0.0306 |
| 0.1127        | 2.2919 | 3000  | 0.1401          | 0.1798 | 0.0303 |
| 0.1123        | 2.5212 | 3300  | 0.1340          | 0.1795 | 0.0315 |
| 0.1137        | 2.7505 | 3600  | 0.1324          | 0.1717 | 0.0302 |
| 0.1124        | 2.9797 | 3900  | 0.1281          | 0.1720 | 0.0294 |
| 0.0937        | 3.2086 | 4200  | 0.1228          | 0.1568 | 0.0276 |
| 0.0898        | 3.4379 | 4500  | 0.1238          | 0.1578 | 0.0272 |
| 0.0912        | 3.6672 | 4800  | 0.1209          | 0.1687 | 0.0286 |
| 0.0921        | 3.8964 | 5100  | 0.1218          | 0.1640 | 0.0276 |
| 0.0774        | 4.1253 | 5400  | 0.1228          | 0.1731 | 0.0295 |
| 0.0725        | 4.3546 | 5700  | 0.1213          | 0.1546 | 0.0273 |
| 0.0746        | 4.5839 | 6000  | 0.1237          | 0.1525 | 0.0273 |
| 0.0714        | 4.8131 | 6300  | 0.1189          | 0.1461 | 0.0260 |
| 0.068         | 5.0420 | 6600  | 0.1257          | 0.1540 | 0.0264 |
| 0.0519        | 5.2713 | 6900  | 0.1228          | 0.1511 | 0.0261 |
| 0.0553        | 5.5006 | 7200  | 0.1243          | 0.1477 | 0.0258 |
| 0.0575        | 5.7298 | 7500  | 0.1196          | 0.1438 | 0.0256 |
| 0.0561        | 5.9591 | 7800  | 0.1120          | 0.1407 | 0.0249 |
| 0.0415        | 6.1880 | 8100  | 0.1288          | 0.1414 | 0.0262 |
| 0.0392        | 6.4173 | 8400  | 0.1321          | 0.1369 | 0.0247 |
| 0.0402        | 6.6465 | 8700  | 0.1227          | 0.1415 | 0.0257 |
| 0.0375        | 6.8758 | 9000  | 0.1227          | 0.1382 | 0.0254 |
| 0.0322        | 7.1047 | 9300  | 0.1393          | 0.1412 | 0.0262 |
| 0.0245        | 7.3340 | 9600  | 0.1395          | 0.1355 | 0.0249 |
| 0.0249        | 7.5632 | 9900  | 0.1365          | 0.1342 | 0.0243 |
| 0.0238        | 7.7925 | 10200 | 0.1394          | 0.1382 | 0.0254 |
| 0.0247        | 8.0214 | 10500 | 0.1536          | 0.1310 | 0.0247 |
| 0.0131        | 8.2507 | 10800 | 0.1474          | 0.1350 | 0.0249 |
| 0.013         | 8.4799 | 11100 | 0.1619          | 0.1325 | 0.0244 |
| 0.0123        | 8.7092 | 11400 | 0.1564          | 0.1291 | 0.0240 |
| 0.0123        | 8.9385 | 11700 | 0.1539          | 0.1272 | 0.0239 |
| 0.0069        | 9.1674 | 12000 | 0.1716          | 0.1268 | 0.0236 |
| 0.0055        | 9.3966 | 12300 | 0.1795          | 0.1257 | 0.0238 |
| 0.0052        | 9.6259 | 12600 | 0.1823          | 0.1248 | 0.0236 |
| 0.0051        | 9.8552 | 12900 | 0.1815          | 0.1258 | 0.0237 |
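For context on the WER and CER columns, here is a minimal sketch of how word and character error rates are typically computed with the Hugging Face evaluate library. The reference and prediction strings are made-up examples, not data from this run.

```python
import evaluate

# WER and CER metrics from the evaluate library (both depend on jiwer).
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Made-up strings purely to illustrate the metric calls.
references = ["this is a test transcription"]
predictions = ["this is a test transkription"]

print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```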

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0