metadata

library_name: transformers
license: mit
base_model: microsoft/speecht5_tts
tags:
  - generated_from_trainer
datasets:
  - m-aliabbas/common_voice_urdu1
model-index:
  - name: TTS urdu
    results: []

TTS urdu

This model is a fine-tuned version of microsoft/speecht5_tts on the m-aliabbas/common_voice_urdu1 dataset. It achieves the following results on the evaluation set:

Loss: 0.4753

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 300
training_steps: 10500
mixed_precision_training: Native AMP
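
Two values in this list follow from the others: total_train_batch_size, and the learning rate at any given step. The sketch below recomputes both. It assumes a single training device (the card does not state the device count) and that the linear scheduler has the usual shape of linear warmup followed by linear decay to zero. The name `linear_lr` is illustrative and not part of any library.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 300
TRAINING_STEPS = 10500

# Assumption: one device, so the effective batch size is
# per-device batch size x gradient accumulation steps.
NUM_DEVICES = 1
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        # Warmup: ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay: ramp from the peak down to 0 at the final training step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 150, 300, 5400, 10500):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 300 and is back to about half of that by step 5400, the midpoint of the decay phase.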

Training results

Training Loss	Epoch	Step	Validation Loss
0.5698	4.3103	500	0.5020
0.528	8.6207	1000	0.4814
0.5092	12.9310	1500	0.4693
0.502	17.2414	2000	0.4720
0.4944	21.5517	2500	0.4665
0.4922	25.8621	3000	0.4635
0.4793	30.1724	3500	0.4653
0.4851	34.4828	4000	0.4684
0.4726	38.7931	4500	0.4651
0.4614	43.1034	5000	0.4660
0.4734	47.4138	5500	0.4652
0.4621	51.7241	6000	0.4688
0.4689	56.0345	6500	0.4730
0.4589	60.3448	7000	0.4663
0.4658	64.6552	7500	0.4725
0.4552	68.9655	8000	0.4742
0.4549	73.2759	8500	0.4763
0.4599	77.5862	9000	0.4726
0.4559	81.8966	9500	0.4738
0.4605	86.2069	10000	0.4764
0.4482	90.5172	10500	0.4753
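
The table can also be summarized programmatically. The sketch below copies the Step, Epoch, and Validation Loss columns, finds the logged checkpoint with the lowest validation loss, and estimates the number of optimizer steps per epoch from the Step/Epoch ratio. The variable names are illustrative, and the steps-per-epoch figure is inferred from the logged values rather than stated by the card.

```python
# (step, epoch, validation_loss) rows copied from the training results table.
RESULTS = [
    (500, 4.3103, 0.5020),
    (1000, 8.6207, 0.4814),
    (1500, 12.9310, 0.4693),
    (2000, 17.2414, 0.4720),
    (2500, 21.5517, 0.4665),
    (3000, 25.8621, 0.4635),
    (3500, 30.1724, 0.4653),
    (4000, 34.4828, 0.4684),
    (4500, 38.7931, 0.4651),
    (5000, 43.1034, 0.4660),
    (5500, 47.4138, 0.4652),
    (6000, 51.7241, 0.4688),
    (6500, 56.0345, 0.4730),
    (7000, 60.3448, 0.4663),
    (7500, 64.6552, 0.4725),
    (8000, 68.9655, 0.4742),
    (8500, 73.2759, 0.4763),
    (9000, 77.5862, 0.4726),
    (9500, 81.8966, 0.4738),
    (10000, 86.2069, 0.4764),
    (10500, 90.5172, 0.4753),
]

# Logged checkpoint with the lowest validation loss.
best_step, best_epoch, best_val_loss = min(RESULTS, key=lambda row: row[2])

# Last logged row, which matches the loss reported at the top of the card.
final_step, final_epoch, final_val_loss = RESULTS[-1]

# Inferred from the first row (step / epoch); not a value the card states.
steps_per_epoch = round(RESULTS[0][0] / RESULTS[0][1])

print(f"best: step {best_step}, validation loss {best_val_loss}")
print(f"final: step {final_step}, validation loss {final_val_loss}")
print(f"estimated optimizer steps per epoch: {steps_per_epoch}")
```

On these numbers the lowest validation loss, 0.4635, is logged at step 3000, while the reported evaluation loss of 0.4753 comes from the final step, 10500.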

Framework versions

Transformers 4.46.0.dev0
Pytorch 2.4.1+cu121
Datasets 3.0.1
Tokenizers 0.20.0