MeloTTS Model Checkpoint

This repository contains trained checkpoints for MeloTTS, a high-quality multilingual text-to-speech system. The checkpoints can be used for speech synthesis.

Model Details

Model Type: MeloTTS
Language Support: English (Default)
Sampling Rate: 44.1 kHz
Mel Channels: 128
Hidden Channels: 192
Filter Channels: 768

Architecture Details

Inter channels: 192
Number of heads: 2
Number of layers: 6
Flow layers: 3
Kernel size: 3
Dropout rate: 0.1
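
To check the values listed above against the shipped config.json, a minimal sketch follows. The key names (`data.sampling_rate`, `model.hidden_channels`, and so on) assume a VITS-style config layout and have not been confirmed against this repository's file; `find_mismatches` and `check_config_file` are hypothetical helpers, not part of MeloTTS. Flow layers are left out because the matching key name is less certain.

```python
import json

# Expected values from this model card. The key names are assumptions
# (VITS-style layout), not confirmed against this repository's config.json.
EXPECTED = {
    ("data", "sampling_rate"): 44100,
    ("data", "n_mel_channels"): 128,
    ("model", "hidden_channels"): 192,
    ("model", "filter_channels"): 768,
    ("model", "inter_channels"): 192,
    ("model", "n_heads"): 2,
    ("model", "n_layers"): 6,
    ("model", "kernel_size"): 3,
    ("model", "p_dropout"): 0.1,
}


def find_mismatches(config: dict) -> dict:
    """Return {(section, key): (expected, actual)} for every differing value."""
    mismatches = {}
    for (section, key), expected in EXPECTED.items():
        actual = config.get(section, {}).get(key)  # None if the key is absent
        if actual != expected:
            mismatches[(section, key)] = (expected, actual)
    return mismatches


def check_config_file(path: str = "config.json") -> dict:
    """Load a config file and compare it with the values listed above."""
    with open(path, encoding="utf-8") as f:
        return find_mismatches(json.load(f))
```

An empty result from `check_config_file()` would mean the file agrees with this card under the assumed key names; a missing key shows up with an actual value of `None`.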

Training Dataset

This model was trained on the Jenny TTS Dataset, which is available on Hugging Face. The dataset consists of high-quality English speech recordings suitable for text-to-speech training.

Model Files

The repository contains several checkpoint files:

DUR_*.pth: Duration predictor checkpoints
G_*.pth: Generator model checkpoints
D_*.pth: Discriminator model checkpoints
config.json: Model configuration file
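
Since several checkpoints share each prefix, it can help to pick the most recent one programmatically. The sketch below assumes the part matched by `*` is an integer training step (for example `G_12000.pth`), which this card does not state; `latest_checkpoint` is a hypothetical helper, not part of MeloTTS.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: the "*" in the file names is an integer training step.
_STEP = re.compile(r"^(DUR|G|D)_(\d+)\.pth$")


def latest_checkpoint(directory: str, prefix: str = "G") -> Optional[Path]:
    """Return the checkpoint with the highest step for the given prefix, or None."""
    best_step, best_path = -1, None
    for path in Path(directory).iterdir():
        match = _STEP.match(path.name)
        if match is None or match.group(1) != prefix:
            continue
        step = int(match.group(2))  # compare numerically, not as strings
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

Comparing the step as an integer matters here: a plain string sort would rank `G_9000.pth` above `G_12000.pth`.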

Usage

To use this model with MeloTTS:

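from melo.api import TTS

# Load a generator checkpoint and its config downloaded from this repository.
# Both paths are placeholders; point them at your local files.
tts = TTS(
    language="EN",
    device="auto",
    config_path="config.json",
    ckpt_path="G_*.pth",  # placeholder: substitute an actual generator checkpoint
)

# tts_to_file expects a numeric speaker ID, so look it up by name.
# Inspect tts.hps.data.spk2id if "EN-default" is not present.
speaker_id = tts.hps.data.spk2id["EN-default"]

# Generate speech
tts.tts_to_file("Your text here", speaker_id, "output.wav")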

Training Details

The model was trained with the following specifications:

Batch size: 6
Learning rate: 0.0003
Beta values: [0.8, 0.99]
Segment size: 16384
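
To put the segment size in perspective, the arithmetic below converts it to a duration. It assumes the segment size is measured in audio samples at the 44.1 kHz sampling rate, and that one segment is sliced per batch item, as in VITS-style training; neither is stated on this card.

```python
SAMPLING_RATE_HZ = 44_100  # 44.1 kHz, from Model Details
SEGMENT_SIZE = 16_384      # from Training Details; assumed to be in samples
BATCH_SIZE = 6             # from Training Details


def segment_seconds(segment_size: int = SEGMENT_SIZE, rate: int = SAMPLING_RATE_HZ) -> float:
    """Duration of one training segment in seconds."""
    return segment_size / rate


def audio_seconds_per_batch(batch_size: int = BATCH_SIZE) -> float:
    """Total audio covered by one batch, assuming one segment per item."""
    return batch_size * segment_seconds()


print(f"{segment_seconds():.4f} s per segment")
print(f"{audio_seconds_per_batch():.3f} s per batch")
```

Under these assumptions each segment covers roughly 0.37 seconds of audio, so a batch of 6 spans a little over 2 seconds.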

Original Repository

This model is based on MeloTTS by MyShell.ai. Visit the original repository for more details about the architecture and implementation.

License

This model follows the same licensing as the original MeloTTS repository (MIT License).
