Hubert_noisy_common_voice_phonemes_debug

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_NOISY_COMMON_VOICE - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9125
  • Wer: 1.0222
  • Cer: 0.3103
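
A WER above 1.0 can look like an error, but it is possible under the usual definition: the edit distance (substitutions + deletions + insertions) is divided by the reference length, so extra inserted tokens can push the ratio past 1. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation script used for this model, and the helper names (`edit_distance`, `wer`, `cer`) and the example strings are hypothetical. It also shows one possible, unconfirmed reason why WER sits near 1.0 while CER is much lower: if the transcripts contain no spaces, each utterance counts as a single "word", and any character error makes that whole word wrong.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: tokens are single characters."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example strings, not taken from the evaluation set.
reference = "こんにちは"
one_char_off = "こんにちわ"     # one wrong character, no spaces
extra_space = "こんにち わ"     # same error plus a spurious word boundary

print(wer(reference, one_char_off), cer(reference, one_char_off))
print(wer(reference, extra_space), cer(reference, extra_space))
```

In the first case a single wrong character yields a WER of 1.0 but a CER of only 0.2; in the second, the spurious word boundary adds an insertion and the WER rises to 2.0.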

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
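
Two of the values above interact in ways that are easy to miss, so here is a small sketch of the arithmetic. The total batch size of 32 is the per-device batch size times the gradient accumulation steps, assuming a single device (the device count is not stated in this card). The results table further down shows step 9400 at epoch 25.0, i.e. 376 optimizer steps per epoch and about 11,280 steps over 30 epochs, which is fewer than the 12,500 warmup steps. Assuming the standard linear-warmup-then-cosine schedule, the learning rate would therefore still be ramping up when training ended and would never reach the configured 0.0003. The function names below are hypothetical, and the schedule formula is written out by hand rather than taken from the training code.

```python
import math

# Values from the hyperparameter list.
PEAK_LR = 3e-4
WARMUP_STEPS = 12_500
NUM_EPOCHS = 30
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2

# Derived from the results table: step 9400 corresponds to epoch 25.0.
STEPS_PER_EPOCH = 9400 // 25
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

# Assumption: training ran on one device; the card does not say.
NUM_DEVICES = 1


def effective_batch_size(per_device, grad_accum, num_devices=NUM_DEVICES):
    """Samples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def lr_at(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
print("total optimizer steps:", TOTAL_STEPS)
print("warmup finished before training ended:", WARMUP_STEPS <= TOTAL_STEPS)
print("learning rate at the last step:", lr_at(TOTAL_STEPS))
```

Under these assumptions the highest learning rate reached is roughly 90% of the configured peak, and the cosine decay phase is never entered.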

Training results

Training Loss Epoch Step Validation Loss Wer Cer
No log 0.2660 100 12.2968 1.0717 1.0679
No log 0.5319 200 5.9423 1.0 0.9813
No log 0.7979 300 5.4201 1.0 0.9813
No log 1.0638 400 4.9840 1.0 0.9813
6.4953 1.3298 500 4.4992 1.0 0.9813
6.4953 1.5957 600 4.0212 1.0 0.9813
6.4953 1.8617 700 3.5973 1.0 0.9813
6.4953 2.1277 800 3.2996 1.0 0.9813
6.4953 2.3936 900 3.1686 1.0 0.9813
3.442 2.6596 1000 3.0725 1.0 0.9813
3.442 2.9255 1100 2.9187 1.0 0.9813
3.442 3.1915 1200 2.6014 1.0 0.8917
3.442 3.4574 1300 2.1700 1.0 0.6548
3.442 3.7234 1400 1.7176 1.0 0.4492
2.3862 3.9894 1500 1.5003 1.0 0.4197
2.3862 4.2553 1600 1.3507 1.0 0.4027
2.3862 4.5213 1700 1.2036 1.0 0.3701
2.3862 4.7872 1800 1.0972 1.0 0.3432
2.3862 5.0532 1900 0.9528 1.0 0.3108
1.2375 5.3191 2000 0.8881 1.0 0.2965
1.2375 5.5851 2100 0.8716 1.0 0.3024
1.2375 5.8511 2200 0.8150 1.0 0.2904
1.2375 6.1170 2300 0.7949 1.0 0.2875
1.2375 6.3830 2400 0.7734 1.0 0.2879
0.8538 6.6489 2500 0.7513 1.0 0.2839
0.8538 6.9149 2600 0.7448 1.0 0.2822
0.8538 7.1809 2700 0.7400 1.0 0.2804
0.8538 7.4468 2800 0.7283 1.0 0.2786
0.8538 7.7128 2900 0.7322 1.0 0.2809
0.7165 7.9787 3000 0.7111 1.0 0.2784
0.7165 8.2447 3100 0.7282 1.0 0.2858
0.7165 8.5106 3200 0.6960 1.0 0.2750
0.7165 8.7766 3300 0.7104 1.0 0.2811
0.7165 9.0426 3400 0.7289 1.0006 0.2790
0.6393 9.3085 3500 0.7068 1.0 0.2787
0.6393 9.5745 3600 0.7173 0.9999 0.2768
0.6393 9.8404 3700 0.6848 0.9963 0.2711
0.6393 10.1064 3800 0.7057 0.9954 0.2792
0.6393 10.3723 3900 0.7190 0.9975 0.2792
0.5993 10.6383 4000 0.7214 0.9946 0.2779
0.5993 10.9043 4100 0.7275 0.9931 0.2832
0.5993 11.1702 4200 0.6970 0.9902 0.2744
0.5993 11.4362 4300 0.7212 0.9946 0.2723
0.5993 11.7021 4400 0.7260 0.9915 0.2751
0.5646 11.9681 4500 0.7185 1.0111 0.2737
0.5646 12.2340 4600 0.7415 0.9968 0.2833
0.5646 12.5 4700 0.7404 0.9908 0.2779
0.5646 12.7660 4800 0.7145 0.9885 0.2727
0.5646 13.0319 4900 0.7319 1.0011 0.2719
0.5215 13.2979 5000 0.7503 0.9994 0.2726
0.5215 13.5638 5100 0.7200 1.0067 0.2710
0.5215 13.8298 5200 0.7043 0.9895 0.2746
0.5215 14.0957 5300 0.7587 1.0130 0.2760
0.5215 14.3617 5400 0.7453 0.9886 0.2792
0.4978 14.6277 5500 0.7269 1.0015 0.2754
0.4978 14.8936 5600 0.7381 0.9986 0.2728
0.4978 15.1596 5700 0.7658 1.0445 0.2747
0.4978 15.4255 5800 0.7593 1.0165 0.2758
0.4978 15.6915 5900 0.7959 1.0401 0.2799
0.4807 15.9574 6000 0.7533 1.0161 0.2784
0.4807 16.2234 6100 0.7566 0.9879 0.2775
0.4807 16.4894 6200 0.7418 0.9918 0.2784
0.4807 16.7553 6300 0.7968 0.9957 0.2811
0.4807 17.0213 6400 0.7728 1.0132 0.2754
0.4456 17.2872 6500 0.8130 1.0176 0.2794
0.4456 17.5532 6600 0.8082 1.0552 0.2850
0.4456 17.8191 6700 0.8325 1.0939 0.2797
0.4456 18.0851 6800 0.8033 0.9931 0.2804
0.4456 18.3511 6900 0.7595 1.0057 0.2801
0.4396 18.6170 7000 0.7648 1.0057 0.2816
0.4396 18.8830 7100 0.7651 0.9965 0.2818
0.4396 19.1489 7200 0.7942 1.0526 0.2821
0.4396 19.4149 7300 0.7584 1.0329 0.2865
0.4396 19.6809 7400 0.7743 1.0247 0.2839
0.4402 19.9468 7500 0.7724 0.9974 0.2782
0.4402 20.2128 7600 0.8211 1.0083 0.2819
0.4402 20.4787 7700 0.7944 0.9985 0.2845
0.4402 20.7447 7800 0.8000 1.0283 0.2809
0.4402 21.0106 7900 0.7961 1.0393 0.2848
0.4161 21.2766 8000 0.8153 1.0126 0.2868
0.4161 21.5426 8100 0.7890 1.0290 0.2848
0.4161 21.8085 8200 0.8137 0.9949 0.2876
0.4161 22.0745 8300 0.8160 1.0130 0.2883
0.4161 22.3404 8400 0.8261 0.9967 0.2843
0.4122 22.6064 8500 0.8360 1.0004 0.2872
0.4122 22.8723 8600 0.7974 0.9870 0.2845
0.4122 23.1383 8700 0.8509 1.0251 0.2959
0.4122 23.4043 8800 0.8392 1.0060 0.2996
0.4122 23.6702 8900 0.8572 1.0025 0.2960
0.4233 23.9362 9000 0.8738 1.0243 0.2959
0.4233 24.2021 9100 0.8740 1.0279 0.2897
0.4233 24.4681 9200 0.8348 1.0178 0.2910
0.4233 24.7340 9300 0.8519 1.0287 0.2965
0.4233 25.0 9400 0.8510 0.9975 0.3038
0.4072 25.2660 9500 0.8886 1.0440 0.2998
0.4072 25.5319 9600 0.9135 0.9960 0.3032
0.4072 25.7979 9700 0.8631 1.0018 0.3065
0.4072 26.0638 9800 0.8652 1.0216 0.2992
0.4072 26.3298 9900 0.8664 1.0366 0.2960
0.4149 26.5957 10000 0.8856 1.0248 0.3047
0.4149 26.8617 10100 0.8662 1.0223 0.2998
0.4149 27.1277 10200 0.9195 0.9953 0.3116
0.4149 27.3936 10300 0.9434 1.0148 0.3118
0.4149 27.6596 10400 0.8643 1.0126 0.3096
0.4264 27.9255 10500 0.9074 1.0062 0.3078
0.4264 28.1915 10600 0.8856 1.0497 0.3035
0.4264 28.4574 10700 0.8924 1.0676 0.3032
0.4264 28.7234 10800 0.9018 1.0203 0.3002
0.4264 28.9894 10900 0.9206 1.0573 0.3049
0.4091 29.2553 11000 0.8745 1.0294 0.3033
0.4091 29.5213 11100 0.8626 0.9920 0.3053
0.4091 29.7872 11200 0.9597 1.0218 0.3129
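
In the table above, validation loss is lowest at step 3700 (0.6848) and CER is lowest at step 5100 (0.2710). Both drift upward afterwards while the training loss settles around 0.41, which is consistent with overfitting in the later epochs. The sketch below shows one simple way to pick a checkpoint from such a log. It uses only a handful of rows copied from the table, and the `best_by` helper is hypothetical rather than part of the training setup.

```python
# (step, validation_loss, wer, cer), copied from selected rows of the table.
ROWS = [
    (3200, 0.6960, 1.0, 0.2750),
    (3700, 0.6848, 0.9963, 0.2711),
    (5100, 0.7200, 1.0067, 0.2710),
    (8600, 0.7974, 0.9870, 0.2845),
    (11200, 0.9597, 1.0218, 0.3129),
]

COLUMN_INDEX = {"loss": 1, "wer": 2, "cer": 3}


def best_by(rows, column):
    """Return the row with the smallest value in the given metric column."""
    index = COLUMN_INDEX[column]
    return min(rows, key=lambda row: row[index])


for metric in ("loss", "wer", "cer"):
    step = best_by(ROWS, metric)[0]
    print(f"lowest {metric}: step {step}")
```

If the run is repeated, the `load_best_model_at_end` and `metric_for_best_model` options of `TrainingArguments` in Transformers can keep the best-scoring checkpoint instead of the last one; whether they were set for this run is not stated.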

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size

  • Parameters: 94.4M
  • Tensor type: F32
  • Format: Safetensors