MMS-TTS Guarani Model

This is a text-to-speech (TTS) model for the Guarani language, built on the VITS-based MMS-TTS architecture.

Model Description

The model takes Guarani text as input and synthesizes a speech waveform using the VITS architecture.

Usage

from transformers import VitsModel, AutoTokenizer
import torch

model = VitsModel.from_pretrained("joselobenitezg/mms-grn-tts")
tokenizer = AutoTokenizer.from_pretrained("joselobenitezg/mms-grn-tts")

text = "some example text in the Guarani language"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    output = model(**inputs).waveform

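# Save the output as a WAV file.
# The waveform tensor has shape (batch_size, num_samples), so take the first
# item and convert it to a 1-D NumPy array before writing.
import scipy.io.wavfile
scipy.io.wavfile.write("output.wav", rate=model.config.sampling_rate, data=output[0].cpu().numpy())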
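The snippet above writes the waveform in floating-point format. Some audio players and tools expect 16-bit PCM instead, so the following is a minimal sketch of one way to do that conversion using only the Python standard library. The helper name `write_pcm16_wav` is hypothetical (it is not part of `transformers` or this model), and the sketch assumes the waveform is mono with values nominally in [-1.0, 1.0].

```python
import array
import sys
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Hypothetical helper for illustration; assumes samples are floats
    nominally in [-1.0, 1.0]. Values outside that range are clipped.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # avoid integer overflow
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```

With the variables from the usage snippet, a call might look like `write_pcm16_wav("output_pcm16.wav", output[0].tolist(), model.config.sampling_rate)`.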