Overview

This is an Indonesian finetune of F5-TTS, intended to add Indonesian speech capabilities to the model.

Dataset

Length: 43.35 hours
Audio samples: 43999

Dataset sources:
• data-indsp-news-lvcsr
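
The length and sample count above imply a fairly short average clip. The small sketch below only does that arithmetic; the variable names are illustrative, and it assumes the reported length is the total duration of all samples.

```python
# Figures taken from the Dataset section above.
total_hours = 43.35
num_samples = 43999

# Convert hours to seconds, then divide by the number of clips.
total_seconds = total_hours * 3600
avg_seconds = total_seconds / num_samples

print(f"Average clip length: {avg_seconds:.2f} s")
```

Under that assumption, the average works out to roughly 3.5 seconds per sample.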

Results

The model has some difficulty accurately matching the reference voice and emotion in zero-shot generation. It also hallucinates on long texts.

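Reference text: "Tidak ada yang menakutiku, bahkan kematian sekalipun." (English: "Nothing frightens me, not even death.")
Reference audio: Zilong.ogg
Input text: "Halo. Model faintun ini adalah sebuah percobaan. Masih terdapat beberapa kekurangan jadi tolong dimaklumkan." (English: "Hello. This finetuned model is an experiment. There are still some shortcomings, so please bear with it.")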
Generated audio: Zilong_generated.ogg
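
Since the model hallucinates on long texts, one possible workaround is to split the input into shorter, sentence-aligned chunks and synthesize each chunk separately. The sketch below shows only the splitting step in plain Python. The helper `split_into_chunks` and the `max_chars` limit are hypothetical and not part of F5-TTS, and whether this actually reduces hallucination with this checkpoint is untested.

```python
import re


def split_into_chunks(text: str, max_chars: int = 120) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole, not cut mid-sentence.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# The input text from the example above, with an arbitrary limit of 60 characters.
text = ("Halo. Model faintun ini adalah sebuah percobaan. "
        "Masih terdapat beberapa kekurangan jadi tolong dimaklumkan.")
chunks = split_into_chunks(text, max_chars=60)
for chunk in chunks:
    print(chunk)
```

Each chunk could then be passed to the model as its own input text with the same reference audio, and the generated clips concatenated afterwards.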

License

The pre-trained models are licensed under CC-BY-NC because the training data, Emilia, is an in-the-wild dataset. Sorry for any inconvenience this may cause.


Model tree for Eempostor/F5-TTS-IND-FINETUNE

Base model: SWivid/F5-TTS
This model: finetuned from the base model