MeloTTS Model Checkpoint

This repository contains trained model checkpoints for MeloTTS, a high-quality multilingual text-to-speech system. The checkpoints can be loaded with the MeloTTS library for speech synthesis.

Model Details

  • Model Type: MeloTTS
  • Language Support: English (Default)
  • Sampling Rate: 44.1 kHz
  • Mel Channels: 128
  • Hidden Channels: 192
  • Filter Channels: 768

Architecture Details

  • Inter channels: 192
  • Number of heads: 2
  • Number of layers: 6
  • Flow layers: 3
  • Kernel size: 3
  • Dropout rate: 0.1
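
For quick reference, the hyperparameters listed above can be collected in a small Python structure. This is only a minimal sketch: the class name and field names are hypothetical and may not match the keys used in config.json. It also derives the per-head attention width, assuming the hidden channels (192, from Model Details) are split evenly across the attention heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelHyperparams:
    """Hypothetical container; field names may differ from config.json keys."""

    hidden_channels: int = 192
    inter_channels: int = 192
    filter_channels: int = 768
    n_heads: int = 2
    n_layers: int = 6
    flow_layers: int = 3
    kernel_size: int = 3
    dropout: float = 0.1

    def channels_per_head(self) -> int:
        # Multi-head attention splits the hidden channels evenly across heads,
        # so the channel count must be divisible by the number of heads.
        if self.hidden_channels % self.n_heads != 0:
            raise ValueError("hidden_channels must be divisible by n_heads")
        return self.hidden_channels // self.n_heads


hp = ModelHyperparams()
print(hp.channels_per_head())  # 96
```

With 192 hidden channels and 2 heads, each head attends over 96 channels.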

Training Dataset

This model was trained on the Jenny TTS Dataset, which is available on Hugging Face. The dataset consists of high-quality English speech recordings suitable for text-to-speech training.

Model Files

The repository contains several checkpoint files:

  • DUR_*.pth: Duration predictor checkpoints
  • G_*.pth: Generator model checkpoints
  • D_*.pth: Discriminator model checkpoints
  • config.json: Model configuration file
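
When several checkpoints of the same kind are present, it can help to pick the most recent one programmatically. The sketch below assumes the wildcard in each filename is an integer training step (for example, a name of the form G_<step>.pth); that naming scheme is an assumption, and the helper latest_checkpoint is hypothetical, not part of MeloTTS.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(directory: str, prefix: str = "G") -> Optional[Path]:
    """Return the checkpoint with the highest step number, or None if none match.

    Assumes filenames look like <prefix>_<step>.pth; adjust the pattern otherwise.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Calling latest_checkpoint(".", prefix="G") looks for generator checkpoints in the current directory; passing prefix="D" or prefix="DUR" selects the discriminator or duration predictor files instead.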

Usage

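To use this model with MeloTTS, download config.json and a generator checkpoint from this repository, then load them by path:

from melo.api import TTS

# Load the model from local copies of this repository's files.
# "G_*.pth" is a placeholder: substitute the generator checkpoint you downloaded.
tts = TTS(language="EN", config_path="config.json", ckpt_path="G_*.pth")

# tts_to_file expects a numeric speaker ID, so look it up by name.
# The name must be a key in the spk2id map defined in config.json.
speaker_id = tts.hps.data.spk2id["EN-default"]

# Generate speech
tts.tts_to_file("Your text here", speaker_id, "output.wav")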

Training Details

The model was trained with the following specifications:

  • Batch size: 6
  • Learning rate: 0.0003
  • Beta values: [0.8, 0.99]
  • Segment size: 16384
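
To put the segment size in perspective, it can be converted to a duration. The sketch below assumes the segment size is measured in audio samples, which is an assumption not stated in this card, and uses the 44.1 kHz sampling rate from Model Details.

```python
SAMPLING_RATE_HZ = 44_100  # from Model Details
SEGMENT_SIZE = 16_384      # from Training Details; assumed to be in audio samples


def segment_duration_seconds(segment_size: int, sampling_rate_hz: int) -> float:
    """Length in seconds of one training segment."""
    return segment_size / sampling_rate_hz


duration = segment_duration_seconds(SEGMENT_SIZE, SAMPLING_RATE_HZ)
print(f"{duration:.3f} s")  # 0.372 s
```

Under that assumption, each training segment covers roughly 0.37 seconds of audio.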

Original Repository

This model is based on MeloTTS by MyShell.ai. Visit the original repository for more details about the architecture and implementation.

License

This model follows the same licensing as the original MeloTTS repository (MIT License).
