MarmaSpeakTTS: Text-to-Speech Model for Marma Language

This model provides text-to-speech synthesis for the Marma language (ISO 639-3 code: rmz), a Tibeto-Burman language spoken by the Marma people in Bangladesh and Myanmar.

Model Details

Usage

This model can be used with the 🤗 Transformers library:

from transformers import VitsModel, AutoTokenizer, pipeline
import scipy.io.wavfile

# Load model and tokenizer
model = VitsModel.from_pretrained("CLEAR-Global/marmaspeak-tts-v1")
tokenizer = AutoTokenizer.from_pretrained("CLEAR-Global/marmaspeak-tts-v1")

# Create a pipeline
synthesizer = pipeline("text-to-speech", model=model, tokenizer=tokenizer)

# Synthesize text
text = "ကိုတော် ဇာမာ နီရေလည်း၊"  # Marma text example
output = synthesizer(text)

# Save to file, using the sampling rate reported by the pipeline rather than a hard-coded value
scipy.io.wavfile.write("output.wav", rate=output["sampling_rate"], data=output["audio"][0])
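
The pipeline returns floating-point samples, which `scipy.io.wavfile.write` stores as a 32-bit float WAV file; some players handle 16-bit PCM more reliably. The following is a minimal sketch of an optional conversion step, assuming the audio is a float array with values roughly in the range [-1, 1]. The helper name `to_int16_pcm` is hypothetical and is not part of this model or of Transformers.

```python
import numpy as np

def to_int16_pcm(audio):
    """Convert a float waveform in [-1, 1] to 16-bit PCM samples.

    Hypothetical helper for illustration; not part of the model or library.
    """
    # Drop any leading batch dimension, e.g. shape (1, N) becomes (N,)
    samples = np.asarray(audio, dtype=np.float32).squeeze()
    # Clip to guard against values slightly outside the nominal range
    samples = np.clip(samples, -1.0, 1.0)
    # Scale to the int16 range; the cast truncates any fractional part
    return (samples * 32767).astype(np.int16)

# Possible usage with the pipeline output from the example above:
# scipy.io.wavfile.write("output.wav", rate=output["sampling_rate"],
#                        data=to_int16_pcm(output["audio"]))
```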

Limitations and Biases

  • This is an early version of the model and may have limitations in pronunciation and naturalness.
  • The model works best with properly normalized Marma text.
  • Performance may vary based on the complexity and length of the input text.
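
This model card does not say what normalization is expected or how long an input can be, so the following is only a sketch under assumptions. It applies Unicode NFC normalization, collapses whitespace, and splits long input after the Burmese-script punctuation marks `၊` (U+104A) and `။` (U+104B) so that each piece can be synthesized separately. The function names `normalize_text` and `split_sentences` are hypothetical.

```python
import re
import unicodedata

def normalize_text(text):
    """Apply basic cleanup before synthesis.

    Assumption: NFC normalization plus whitespace cleanup; the model's
    actual preprocessing requirements are not documented here.
    """
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace (including newlines) into single spaces
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Split text after each section mark, keeping the mark with its sentence."""
    # The lookbehind matches just after U+104A or U+104B; \s* consumes any
    # whitespace that follows so it does not start the next piece
    parts = re.split(r"(?<=[\u104A\u104B])\s*", text)
    return [part.strip() for part in parts if part.strip()]
```

Each returned piece could then be passed to the synthesizer on its own and the resulting waveforms concatenated, which may help with longer inputs.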

Training

The model was fine-tuned from a Massively Multilingual Speech (MMS) VITS model using this training recipe.

Ethical Considerations

This model has been developed with permission and input from Marma language speakers. The voice synthesis should be used responsibly and respectfully.

Citation

@misc{marma-tts,
  author = {CLEAR Global},
  title = {MarmaSpeakTTS: A Text-to-Speech Model for Marma Language},
  year = {2025},
  howpublished = {https://huggingface.co/CLEAR-Global/marmaspeak-tts-v1}
}

Model size: 36.3M parameters (Safetensors, tensor type F32)