---
library_name: transformers
language:
  - ja
license: apache-2.0
base_model: rinna/japanese-hubert-base
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_13_0
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Hubert-common_voice-phoneme-debug-warmup500
    results: []
---

Hubert-common_voice-phoneme-debug-warmup500

This model is a fine-tuned version of rinna/japanese-hubert-base on the Japanese (ja) subset of the mozilla-foundation/common_voice_13_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9679
  • Wer: 1.0
  • Cer: 0.9851
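
Since the card ships no usage code, the following is a minimal inference sketch. It assumes the checkpoint carries a CTC head loadable with `HubertForCTC`, that a matching `Wav2Vec2Processor` was saved alongside it, and that the repository id mirrors the model name (`utakumi/Hubert-common_voice-phoneme-debug-warmup500`); adjust these if they do not match your setup.

```python
import torch
import librosa  # extra dependency, used here only for loading/resampling audio
from transformers import HubertForCTC, Wav2Vec2Processor

# Repository id assumed from the model name; change it if the checkpoint lives elsewhere.
MODEL_ID = "utakumi/Hubert-common_voice-phoneme-debug-warmup500"

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = HubertForCTC.from_pretrained(MODEL_ID)
model.eval()

# Load a local Japanese speech clip and resample to the 16 kHz rate HuBERT expects.
speech, sr = librosa.load("sample.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding to a phoneme string.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```

Given the evaluation WER of 1.0 reported above, this debug checkpoint is not expected to produce usable transcriptions; the sketch only illustrates the loading and decoding path.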

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
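
For readers who want to reproduce this setup, the hyperparameters above map onto `transformers.TrainingArguments` roughly as sketched below; the output directory and any logging/evaluation cadence are placeholders rather than values taken from the original run.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="Hubert-common_voice-phoneme-debug-warmup500",
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=30.0,
    fp16=True,                       # "Native AMP" mixed-precision training
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```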

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:---:|:------:|
| No log        | 0.7092  | 100  | 4.5669          | 1.0 | 0.9851 |
| No log        | 1.4184  | 200  | 3.0119          | 1.0 | 0.9851 |
| No log        | 2.1277  | 300  | 2.9840          | 1.0 | 0.9851 |
| No log        | 2.8369  | 400  | 2.9764          | 1.0 | 0.9851 |
| 3.973         | 3.5461  | 500  | 2.9796          | 1.0 | 0.9851 |
| 3.973         | 4.2553  | 600  | 2.9758          | 1.0 | 0.9851 |
| 3.973         | 4.9645  | 700  | 2.9691          | 1.0 | 0.9851 |
| 3.973         | 5.6738  | 800  | 2.9858          | 1.0 | 0.9850 |
| 3.973         | 6.3830  | 900  | 2.9692          | 1.0 | 0.9851 |
| 2.9654        | 7.0922  | 1000 | 2.9895          | 1.0 | 0.9850 |
| 2.9654        | 7.8014  | 1100 | 2.9725          | 1.0 | 0.9850 |
| 2.9654        | 8.5106  | 1200 | 2.9713          | 1.0 | 0.9850 |
| 2.9654        | 9.2199  | 1300 | 2.9758          | 1.0 | 0.9851 |
| 2.9654        | 9.9291  | 1400 | 2.9784          | 1.0 | 0.9850 |
| 2.9643        | 10.6383 | 1500 | 2.9687          | 1.0 | 0.9851 |
| 2.9643        | 11.3475 | 1600 | 2.9779          | 1.0 | 0.9851 |
| 2.9643        | 12.0567 | 1700 | 2.9679          | 1.0 | 0.9850 |
| 2.9643        | 12.7660 | 1800 | 2.9769          | 1.0 | 0.9851 |
| 2.9643        | 13.4752 | 1900 | 2.9718          | 1.0 | 0.9851 |
| 2.9631        | 14.1844 | 2000 | 2.9686          | 1.0 | 0.9851 |
| 2.9631        | 14.8936 | 2100 | 2.9706          | 1.0 | 0.9850 |
| 2.9631        | 15.6028 | 2200 | 2.9791          | 1.0 | 0.9851 |
| 2.9631        | 16.3121 | 2300 | 2.9731          | 1.0 | 0.9851 |
| 2.9631        | 17.0213 | 2400 | 2.9722          | 1.0 | 0.9850 |
| 2.9627        | 17.7305 | 2500 | 2.9723          | 1.0 | 0.9851 |
| 2.9627        | 18.4397 | 2600 | 2.9689          | 1.0 | 0.9851 |
| 2.9627        | 19.1489 | 2700 | 2.9747          | 1.0 | 0.9851 |
| 2.9627        | 19.8582 | 2800 | 2.9801          | 1.0 | 0.9851 |
| 2.9627        | 20.5674 | 2900 | 2.9740          | 1.0 | 0.9851 |
| 2.9622        | 21.2766 | 3000 | 2.9736          | 1.0 | 0.9850 |
| 2.9622        | 21.9858 | 3100 | 2.9719          | 1.0 | 0.9851 |
| 2.9622        | 22.6950 | 3200 | 2.9710          | 1.0 | 0.9850 |
| 2.9622        | 23.4043 | 3300 | 2.9714          | 1.0 | 0.9850 |
| 2.9622        | 24.1135 | 3400 | 2.9701          | 1.0 | 0.9851 |
| 2.9609        | 24.8227 | 3500 | 2.9695          | 1.0 | 0.9851 |
| 2.9609        | 25.5319 | 3600 | 2.9669          | 1.0 | 0.9850 |
| 2.9609        | 26.2411 | 3700 | 2.9774          | 1.0 | 0.9851 |
| 2.9609        | 26.9504 | 3800 | 2.9712          | 1.0 | 0.9851 |
| 2.9609        | 27.6596 | 3900 | 2.9701          | 1.0 | 0.9851 |
| 2.962         | 28.3688 | 4000 | 2.9689          | 1.0 | 0.9851 |
| 2.962         | 29.0780 | 4100 | 2.9738          | 1.0 | 0.9850 |
| 2.962         | 29.7872 | 4200 | 2.9678          | 1.0 | 0.9851 |

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3