VITS Text-to-Speech Model for Russian
The model expects lowercase input text.
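A minimal sketch of preparing input text before tokenization. The helper name `prepare_text` is hypothetical, and collapsing whitespace is an optional extra step assumed here for tidiness, not something the model card requires. The `+` signs in the sample sentence appear to mark the stressed vowel; that reading is an assumption, so the helper simply leaves them untouched.

```python
import re


def prepare_text(text: str) -> str:
    """Lowercase the text and collapse whitespace runs; '+' marks are kept as-is."""
    text = text.lower()  # the model expects lowercase input
    # Optional cleanup (an assumption, not a documented requirement):
    # squeeze repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(prepare_text("Привет,  как дел+а?  "))
```

The result can be passed to the tokenizer in place of the manual `text.lower()` call.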
Example Text to Speech
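from transformers import VitsModel, AutoTokenizer
import torch
import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile.write resolves

# Load the pretrained model and its tokenizer
model = VitsModel.from_pretrained("joefox/tts_vits_ru_hf")
tokenizer = AutoTokenizer.from_pretrained("joefox/tts_vits_ru_hf")

text = "Привет, как дел+а? Всё +очень хорош+о! А у тебя как?"
text = text.lower()  # the model expects lowercase input
inputs = tokenizer(text, return_tensors="pt")
inputs['speaker_id'] = 3  # select the speaker voice

# Run inference without tracking gradients
with torch.no_grad():
    output = model(**inputs).waveform

# Save the generated waveform as a WAV file
scipy.io.wavfile.write("techno.wav", rate=model.config.sampling_rate, data=output[0].cpu().numpy())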
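Writing a floating-point array with `scipy.io.wavfile.write` produces a 32-bit float WAV file, which some players may not open. If that is a concern, the waveform can be converted to 16-bit PCM first. The sketch below uses a hypothetical helper, `to_int16_pcm`, and assumes the waveform samples lie roughly in the range [-1.0, 1.0]; out-of-range samples are clipped.

```python
import numpy as np


def to_int16_pcm(waveform: np.ndarray) -> np.ndarray:
    """Convert a float waveform (assumed to lie in [-1.0, 1.0]) to 16-bit PCM."""
    clipped = np.clip(waveform, -1.0, 1.0)  # guard against out-of-range samples
    return (clipped * 32767.0).astype(np.int16)


# Hypothetical usage with the variables from the example above:
# pcm = to_int16_pcm(output[0].cpu().numpy())
# scipy.io.wavfile.write("techno_int16.wav", rate=model.config.sampling_rate, data=pcm)
```

The output file name `techno_int16.wav` is only illustrative.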
To play the audio in a Jupyter Notebook or Google Colab:
from IPython.display import Audio
Audio(output[0].cpu().numpy(), rate=model.config.sampling_rate)
Languages covered
Russian (ru_RU)