dgo-tts-training-data-b-speecht5

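This model is a fine-tuned version of microsoft/speecht5_tts on an unspecified dataset (the auto-generated card records the dataset name as None). It achieves the following results on the evaluation set:

Loss: 0.4971 (validation loss at the final training step, 40000)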

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 4000
training_steps: 40000
mixed_precision_training: Native AMP
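
The batch-size and scheduler entries above interact: the effective batch size is train_batch_size times gradient_accumulation_steps, and a linear schedule ramps the learning rate up during warmup and then decays it to zero at the final step. The following is a minimal sketch in plain Python, assuming a single training device (consistent with 8 x 4 = 32) and the usual linear-warmup, linear-decay rule. The function name `linear_lr` is hypothetical and is not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 4_000
TRAINING_STEPS = 40_000

# Gradients from 4 micro-batches of 8 are accumulated before each optimizer step.
# Assumption: one device, so no extra multiplier is applied.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Rises linearly from 0 to LEARNING_RATE over WARMUP_STEPS, then
    falls linearly to 0 at TRAINING_STEPS.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for s in (0, 2_000, 4_000, 22_000, 40_000):
        print(f"step {s:>6}: lr = {linear_lr(s):.2e}")
```

Under these assumptions the learning rate peaks at step 4000, one tenth of the way through the run, and is already half its peak value by step 22000.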

Training Loss	Epoch	Step	Validation Loss
0.656	5.3763	1000	0.6172
0.5786	10.7527	2000	0.5560
0.55	16.1290	3000	0.5385
0.5296	21.5054	4000	0.5166
0.5266	26.8817	5000	0.5088
0.5067	32.2581	6000	0.5067
0.4973	37.6344	7000	0.4968
0.4833	43.0108	8000	0.4990
0.4802	48.3871	9000	0.5007
0.4693	53.7634	10000	0.4955
0.4576	59.1398	11000	0.4942
0.4509	64.5161	12000	0.4891
0.4474	69.8925	13000	0.4947
0.4335	75.2688	14000	0.4943
0.433	80.6452	15000	0.4900
0.4254	86.0215	16000	0.4923
0.4282	91.3978	17000	0.4931
0.4153	96.7742	18000	0.4946
0.4154	102.1505	19000	0.4946
0.4221	107.5269	20000	0.4954
0.4209	112.9032	21000	0.4940
0.4199	118.2796	22000	0.4951
0.4168	123.6559	23000	0.4950
0.4169	129.0323	24000	0.4950
0.423	134.4086	25000	0.4971
0.4131	139.7849	26000	0.4965
0.4165	145.1613	27000	0.4944
0.4174	150.5376	28000	0.4954
0.4137	155.9140	29000	0.4960
0.4183	161.2903	30000	0.4981
0.416	166.6667	31000	0.4960
0.4077	172.0430	32000	0.4977
0.4094	177.4194	33000	0.4963
0.4131	182.7957	34000	0.4966
0.4106	188.1720	35000	0.4955
0.4095	193.5484	36000	0.4982
0.4087	198.9247	37000	0.4975
0.4088	204.3011	38000	0.4967
0.4085	209.6774	39000	0.4976
0.407	215.0538	40000	0.4971
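
The table shows training loss falling throughout the run while validation loss flattens early, so the final checkpoint is not necessarily the one with the lowest validation loss. As a rough illustration, the sketch below finds the evaluation step with the lowest validation loss and estimates the number of optimizer steps per epoch from the first row. The validation losses are copied from the table; the variable names are illustrative only.

```python
# Validation loss at steps 1000, 2000, ..., 40000, copied from the table above.
VAL_LOSS = [
    0.6172, 0.5560, 0.5385, 0.5166, 0.5088, 0.5067, 0.4968, 0.4990, 0.5007, 0.4955,
    0.4942, 0.4891, 0.4947, 0.4943, 0.4900, 0.4923, 0.4931, 0.4946, 0.4946, 0.4954,
    0.4940, 0.4951, 0.4950, 0.4950, 0.4971, 0.4965, 0.4944, 0.4954, 0.4960, 0.4981,
    0.4960, 0.4977, 0.4963, 0.4966, 0.4955, 0.4982, 0.4975, 0.4967, 0.4976, 0.4971,
]
STEP_INTERVAL = 1_000          # evaluation ran every 1000 steps
EPOCH_AT_FIRST_EVAL = 5.3763   # "Epoch" column of the first row

steps = [STEP_INTERVAL * (i + 1) for i in range(len(VAL_LOSS))]

# min() over (loss, step) pairs picks the lowest loss; ties go to the earlier step.
best_loss, best_step = min(zip(VAL_LOSS, steps))
final_loss = VAL_LOSS[-1]

# 1000 steps covered about 5.3763 epochs, so one epoch is about 186 optimizer steps.
steps_per_epoch = round(STEP_INTERVAL / EPOCH_AT_FIRST_EVAL)

if __name__ == "__main__":
    print(f"best validation loss {best_loss} at step {best_step}")
    print(f"final validation loss {final_loss} at step {steps[-1]}")
    print(f"approx. optimizer steps per epoch: {steps_per_epoch}")
```

With an effective batch size of 32, about 186 steps per epoch suggests a training set of very roughly 6,000 examples. This is only an estimate, since the exact count depends on how the last partial batch of each epoch is handled.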

Safetensors

Model size: 0.1B params
Tensor type: F32
