MMS-TTS Guarani Model

This is a text-to-speech (TTS) model for the Guarani language. It uses VITS, the architecture behind the MMS-TTS (Massively Multilingual Speech) checkpoints.

Model Description

The model takes Guarani text as input and returns a speech waveform. The sampling rate of the generated audio is available as model.config.sampling_rate.

Usage

from transformers import VitsModel, AutoTokenizer
import torch

model = VitsModel.from_pretrained("joselobenitezg/mms-grn-tts")
tokenizer = AutoTokenizer.from_pretrained("joselobenitezg/mms-grn-tts")

text = "some example text in the Guarani language"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    output = model(**inputs).waveform

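# Save the output as a WAV file.
# The waveform is a torch tensor of shape (batch, samples); wavfile.write
# expects a NumPy array, so drop the batch dimension and convert it first.
from scipy.io import wavfile
wavfile.write("output.wav", rate=model.config.sampling_rate, data=output.squeeze(0).cpu().numpy())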
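
If SciPy is not available, the waveform can also be written with Python's standard-library wave module. The following is a minimal sketch, not part of the original model card: write_wav_int16 is a hypothetical helper name, and it assumes a single mono utterance whose float samples lie roughly in [-1.0, 1.0] (values outside that range are clipped).

```python
import array
import sys
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file.

    `samples` is any iterable of floats, assumed to lie in [-1.0, 1.0].
    Returns the number of frames written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overflow
        pcm.append(int(round(clipped * 32767)))

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())

    return len(pcm)
```

With the variables from the usage example above, a call could look like write_wav_int16("output.wav", output.squeeze(0).tolist(), model.config.sampling_rate). Converting to 16-bit integers also tends to produce files that more audio players open without trouble than 32-bit float WAV files.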