Hubert-common_voice_JSUT-ja-demo-japanese

This model is a fine-tuned version of rinna/japanese-hubert-base on the Japanese (ja) subset of the MOZILLA-FOUNDATION/COMMON_VOICE_13_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1568
  • Wer: 1.9920
  • Cer: 0.6415
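
For inference, the checkpoint can be loaded for CTC-based speech recognition with transformers. This is a minimal sketch, assuming the repository ships a Wav2Vec2Processor alongside the HubertForCTC weights and that the input audio is resampled to the 16 kHz rate expected by the rinna/japanese-hubert-base backbone; sample.wav is a placeholder path:

```python
import torch
import librosa
from transformers import HubertForCTC, Wav2Vec2Processor

# Assumption: the repo contains both a processor config and a CTC model.
model_id = "utakumi/Hubert-common_voice_JSUT-ja-demo-japanese"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = HubertForCTC.from_pretrained(model_id)
model.eval()

# HuBERT expects 16 kHz mono audio; resample on load if needed.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)  # placeholder file

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax per frame, then collapse repeats and blanks.
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```

Greedy argmax decoding is the simplest option; given the high error rates reported on this card, beam-search decoding with a language model may help.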

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
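
A hedged sketch of how the hyperparameters above map onto transformers TrainingArguments; the output path is illustrative, and the dataset and Trainer wiring of the original run are not shown:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./hubert-common_voice-ja",  # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=20.0,
    fp16=True,  # native AMP mixed-precision training
)
```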

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
| No log | 0.1934 | 100 | 84.6958 | 1.0115 | 8.4850 |
| No log | 0.3868 | 200 | 83.7886 | 1.0090 | 8.3443 |
| No log | 0.5803 | 300 | 81.7457 | 1.0004 | 4.8157 |
| No log | 0.7737 | 400 | 75.4304 | 1.0 | 0.9907 |
| 66.0277 | 0.9671 | 500 | 63.1251 | 1.0 | 0.9907 |
| 66.0277 | 1.1605 | 600 | 57.1050 | 1.0 | 0.9907 |
| 66.0277 | 1.3540 | 700 | 55.6799 | 1.0 | 0.9908 |
| 66.0277 | 1.5474 | 800 | 55.0476 | 1.0 | 0.9907 |
| 66.0277 | 1.7408 | 900 | 54.4085 | 1.0 | 0.9907 |
| 46.3141 | 1.9342 | 1000 | 53.6893 | 1.0 | 0.9908 |
| 46.3141 | 2.1277 | 1100 | 52.9711 | 1.0 | 0.9907 |
| 46.3141 | 2.3211 | 1200 | 52.1326 | 1.0 | 0.9907 |
| 46.3141 | 2.5145 | 1300 | 51.2549 | 1.0 | 0.9907 |
| 46.3141 | 2.7079 | 1400 | 50.2649 | 1.0 | 0.9907 |
| 42.8642 | 2.9014 | 1500 | 49.2081 | 1.0 | 0.9907 |
| 42.8642 | 3.0948 | 1600 | 48.1051 | 1.0 | 0.9907 |
| 42.8642 | 3.2882 | 1700 | 46.8788 | 1.0 | 0.9907 |
| 42.8642 | 3.4816 | 1800 | 45.5411 | 1.0 | 0.9907 |
| 42.8642 | 3.6750 | 1900 | 44.1516 | 1.0 | 0.9907 |
| 38.3378 | 3.8685 | 2000 | 42.6087 | 1.0 | 0.9907 |
| 38.3378 | 4.0619 | 2100 | 40.9815 | 1.0 | 0.9907 |
| 38.3378 | 4.2553 | 2200 | 39.2401 | 1.0 | 0.9907 |
| 38.3378 | 4.4487 | 2300 | 37.4022 | 1.0 | 0.9908 |
| 38.3378 | 4.6422 | 2400 | 35.4309 | 1.0 | 0.9907 |
| 31.9192 | 4.8356 | 2500 | 33.4175 | 1.0 | 0.9907 |
| 31.9192 | 5.0290 | 2600 | 31.2660 | 1.0 | 0.9907 |
| 31.9192 | 5.2224 | 2700 | 29.0147 | 1.0 | 0.9908 |
| 31.9192 | 5.4159 | 2800 | 26.6885 | 1.0 | 0.9907 |
| 31.9192 | 5.6093 | 2900 | 24.3010 | 1.0 | 0.9907 |
| 23.4284 | 5.8027 | 3000 | 21.8808 | 1.0 | 0.9907 |
| 23.4284 | 5.9961 | 3100 | 19.4735 | 1.0 | 0.9908 |
| 23.4284 | 6.1896 | 3200 | 17.1293 | 1.0 | 0.9909 |
| 23.4284 | 6.3830 | 3300 | 14.8638 | 1.0 | 0.9908 |
| 23.4284 | 6.5764 | 3400 | 12.8062 | 1.0 | 0.9907 |
| 13.9431 | 6.7698 | 3500 | 10.9643 | 1.0 | 0.9907 |
| 13.9431 | 6.9632 | 3600 | 9.4119 | 1.0 | 0.9907 |
| 13.9431 | 7.1567 | 3700 | 8.1640 | 1.0 | 0.9907 |
| 13.9431 | 7.3501 | 3800 | 7.2297 | 1.0 | 0.9907 |
| 13.9431 | 7.5435 | 3900 | 6.5716 | 1.0 | 0.9907 |
| 7.4585 | 7.7369 | 4000 | 6.1413 | 1.0 | 0.9907 |
| 7.4585 | 7.9304 | 4100 | 5.8854 | 1.0 | 0.9907 |
| 7.4585 | 8.1238 | 4200 | 5.7707 | 1.0 | 0.9907 |
| 7.4585 | 8.3172 | 4300 | 5.6802 | 1.0 | 0.9907 |
| 7.4585 | 8.5106 | 4400 | 5.5971 | 1.0 | 0.9907 |
| 5.7398 | 8.7041 | 4500 | 5.5333 | 1.0 | 0.9907 |
| 5.7398 | 8.8975 | 4600 | 5.4751 | 1.0 | 0.9907 |
| 5.7398 | 9.0909 | 4700 | 5.4254 | 1.0 | 0.9907 |
| 5.7398 | 9.2843 | 4800 | 5.3775 | 1.1319 | 0.9908 |
| 5.7398 | 9.4778 | 4900 | 5.3433 | 1.3321 | 0.9907 |
| 5.4159 | 9.6712 | 5000 | 5.3119 | 1.6862 | 0.9906 |
| 5.4159 | 9.8646 | 5100 | 5.2691 | 1.4255 | 0.9910 |
| 5.4159 | 10.0580 | 5200 | 5.2369 | 1.4043 | 0.9909 |
| 5.4159 | 10.2515 | 5300 | 5.1949 | 1.5686 | 0.9910 |
| 5.4159 | 10.4449 | 5400 | 5.1519 | 1.5166 | 0.9908 |
| 5.2163 | 10.6383 | 5500 | 5.1081 | 1.2477 | 0.9910 |
| 5.2163 | 10.8317 | 5600 | 5.0553 | 1.5124 | 0.9908 |
| 5.2163 | 11.0251 | 5700 | 5.0123 | 1.5496 | 0.9909 |
| 5.2163 | 11.2186 | 5800 | 4.9424 | 1.7622 | 0.9886 |
| 5.2163 | 11.4120 | 5900 | 4.8753 | 1.5404 | 0.9831 |
| 4.9465 | 11.6054 | 6000 | 4.7768 | 1.8535 | 0.9750 |
| 4.9465 | 11.7988 | 6100 | 4.6841 | 1.8396 | 0.9713 |
| 4.9465 | 11.9923 | 6200 | 4.5828 | 1.7444 | 0.9697 |
| 4.9465 | 12.1857 | 6300 | 4.4853 | 1.8013 | 0.9689 |
| 4.9465 | 12.3791 | 6400 | 4.3955 | 1.8278 | 0.9556 |
| 4.5094 | 12.5725 | 6500 | 4.2842 | 1.8729 | 0.9123 |
| 4.5094 | 12.7660 | 6600 | 4.1819 | 1.9094 | 0.8650 |
| 4.5094 | 12.9594 | 6700 | 4.0741 | 1.9135 | 0.8486 |
| 4.5094 | 13.1528 | 6800 | 3.9649 | 1.9191 | 0.8386 |
| 4.5094 | 13.3462 | 6900 | 3.8641 | 1.9195 | 0.8189 |
| 4.0097 | 13.5397 | 7000 | 3.7687 | 1.9276 | 0.8014 |
| 4.0097 | 13.7331 | 7100 | 3.6808 | 1.9259 | 0.7963 |
| 4.0097 | 13.9265 | 7200 | 3.6021 | 1.9276 | 0.7792 |
| 4.0097 | 14.1199 | 7300 | 3.5533 | 1.9367 | 0.7775 |
| 4.0097 | 14.3133 | 7400 | 3.4768 | 1.9321 | 0.7751 |
| 3.5619 | 14.5068 | 7500 | 3.4285 | 1.9385 | 0.7672 |
| 3.5619 | 14.7002 | 7600 | 3.3628 | 1.9362 | 0.7660 |
| 3.5619 | 14.8936 | 7700 | 3.2910 | 1.9314 | 0.7619 |
| 3.5619 | 15.0870 | 7800 | 3.2243 | 1.9288 | 0.7486 |
| 3.5619 | 15.2805 | 7900 | 3.1645 | 1.9308 | 0.7432 |
| 3.2379 | 15.4739 | 8000 | 3.1186 | 1.9332 | 0.7383 |
| 3.2379 | 15.6673 | 8100 | 3.0783 | 1.9349 | 0.7375 |
| 3.2379 | 15.8607 | 8200 | 3.0146 | 1.9321 | 0.7279 |
| 3.2379 | 16.0542 | 8300 | 2.9523 | 1.9308 | 0.7300 |
| 3.2379 | 16.2476 | 8400 | 2.9187 | 1.9274 | 0.7254 |
| 2.9448 | 16.4410 | 8500 | 2.8671 | 1.9290 | 0.7177 |
| 2.9448 | 16.6344 | 8600 | 2.8189 | 1.9349 | 0.7116 |
| 2.9448 | 16.8279 | 8700 | 2.7691 | 1.9365 | 0.7078 |
| 2.9448 | 17.0213 | 8800 | 2.7317 | 1.9420 | 0.7069 |
| 2.9448 | 17.2147 | 8900 | 2.6832 | 1.9490 | 0.7056 |
| 2.6749 | 17.4081 | 9000 | 2.6420 | 1.9784 | 0.7020 |
| 2.6749 | 17.6015 | 9100 | 2.6020 | 1.9415 | 0.6991 |
| 2.6749 | 17.7950 | 9200 | 2.5667 | 1.9762 | 0.6995 |
| 2.6749 | 17.9884 | 9300 | 2.5171 | 1.9857 | 0.6771 |
| 2.6749 | 18.1818 | 9400 | 2.4922 | 1.9890 | 0.6775 |
| 2.4473 | 18.3752 | 9500 | 2.4455 | 1.9883 | 0.6683 |
| 2.4473 | 18.5687 | 9600 | 2.4192 | 1.9815 | 0.6621 |
| 2.4473 | 18.7621 | 9700 | 2.3866 | 1.9905 | 0.6523 |
| 2.4473 | 18.9555 | 9800 | 2.3354 | 1.9914 | 0.6539 |
| 2.4473 | 19.1489 | 9900 | 2.3114 | 1.9925 | 0.6516 |
| 2.2307 | 19.3424 | 10000 | 2.2695 | 1.9903 | 0.6454 |
| 2.2307 | 19.5358 | 10100 | 2.2466 | 1.9925 | 0.6464 |
| 2.2307 | 19.7292 | 10200 | 2.2167 | 1.9929 | 0.6423 |
| 2.2307 | 19.9226 | 10300 | 2.1762 | 1.9914 | 0.6413 |
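
The Wer and Cer columns are word and character error rates on the validation set; both can exceed 1.0 when the decoder emits more erroneous tokens than the reference contains, which is why values near 2.0 appear above. A minimal sketch of computing these metrics with the evaluate library (the strings are illustrative, not from the evaluation set):

```python
import evaluate

# Word and character error rate, as reported in the table above.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Illustrative strings; the real run scores decoded predictions
# against Common Voice reference transcripts.
predictions = ["こんにちは 世界"]
references = ["こんにちは 世界 です"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```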

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3