HelpingAI-TTS-v1
Yo, what's good! Welcome to HelpingAI-TTS-v1, your go-to for next-level Text-to-Speech (TTS) that's all about personalization, vibes, and clarity. Whether you want your text to sound cheerful, emotional, or just like you're chatting with a friend, this model's got you covered.
What's HelpingAI-TTS-v1?
HelpingAI-TTS-v1 is a beast when it comes to generating high-quality, customizable speech. It doesn't just spit out generic speech; it picks up on what you're saying and brings it to life with style. Add a description of how the speech should sound, like how fast or slow it should be or whether it's cheerful or serious, and BOOM, you've got yourself the perfect audio output.
How It Works: A Quick Rundown
- Transcript: The text you want spoken. Keep it casual, formal, or whatever suits your vibe.
- Caption: A plain-English description of how you want the speech to sound. Want a fast-paced, hype vibe or a calm, soothing tone? Just say it.
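To make the caption idea concrete, here's a minimal sketch of putting one together from a few style choices. The `build_caption` helper and its `tone`, `pace`, and `extra` parameters are made up for illustration and are not part of the model or of parler-tts; the caption itself is just a plain string.

```python
def build_caption(tone="friendly", pace="moderate", extra=""):
    """Compose a plain-English caption from a few style choices.

    Hypothetical helper: the model only needs a descriptive string,
    so any wording that describes the voice you want should work.
    """
    caption = f"A {tone} tone with a {pace} speed."
    if extra:
        # Append any free-form detail about the speaker
        caption += f" {extra.strip()}"
    return caption


caption = build_caption(tone="calm, soothing", pace="slow", extra="Speaker sounds relaxed.")
print(caption)
```

You can skip the helper entirely and write the caption by hand. It only keeps the wording consistent when you're trying out lots of styles.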
Features You'll Love:
- Expressive Speech: This isn't just any TTS. You can describe the tone, speed, and vibe you want. Whether it's a peppy "Hey!" or a chill "What's up?", this model's got your back.
- Top-Notch Quality: Super clean audio. No static. Just pure, high-quality sound that makes your words pop.
- Customizable Like Never Before: Play with emotions, tone, and even accents. It's all about making it personal.
Get Started: Installation
Ready to vibe? Here's how you set up HelpingAI-TTS-v1 with a single command:
pip install git+https://github.com/huggingface/parler-tts.git
Usage: Let's Make Some Magic
Here's the code that gets the job done. It's super simple to use: just plug in your text and describe how you want it to sound. It's like setting the mood for a movie.
import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf
# Choose your device (GPU or CPU)
device = "cuda:0" if torch.cuda.is_available() else "cpu"
# Load the model and tokenizers
model = ParlerTTSForConditionalGeneration.from_pretrained("HelpingAI/HelpingAI-TTS-v1").to(device)
tokenizer = AutoTokenizer.from_pretrained("HelpingAI/HelpingAI-TTS-v1")
description_tokenizer = AutoTokenizer.from_pretrained(model.config.text_encoder._name_or_path)
# Customize your inputs: text + description
prompt = "Hey, what's up? How's it going?"
description = "A friendly, upbeat, and casual tone with a moderate speed. Speaker sounds confident and relaxed."
# Tokenize the inputs
input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# Generate the audio
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()
# Save the audio to a file
sf.write("output.wav", audio_arr, model.config.sampling_rate)
This will create a super clean .wav file (output.wav) with the speech you asked for.
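If you want to hear the same line in a few different styles, one option is to wrap the steps above in a small function. This is only a sketch: `synthesize` and `synthesize_variants` are hypothetical names, and they assume the `model`, `tokenizer`, and `description_tokenizer` objects behave exactly as in the snippet above.

```python
def synthesize(model, tokenizer, description_tokenizer, prompt, description, device="cpu"):
    """Return the audio array for `prompt`, spoken as `description` asks.

    Mirrors the tokenize -> generate -> numpy steps from the usage snippet.
    """
    # The description goes through the text-encoder tokenizer
    input_ids = description_tokenizer(description, return_tensors="pt").input_ids.to(device)
    # The transcript goes through the model's own tokenizer
    prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    return generation.cpu().numpy().squeeze()


def synthesize_variants(model, tokenizer, description_tokenizer, prompt, descriptions, device="cpu"):
    """Generate one audio array per description for the same prompt."""
    return {
        description: synthesize(model, tokenizer, description_tokenizer, prompt, description, device)
        for description in descriptions
    }
```

With the objects loaded earlier, something like `synthesize_variants(model, tokenizer, description_tokenizer, prompt, ["A calm, slow tone.", "A fast, excited tone."], device)` should give you a dict of arrays. You can write each one out with `sf.write(...)` and `model.config.sampling_rate`, same as before.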
Language Support: Speak Your Language
No matter where you're from, HelpingAI-TTS-v1 has you covered. It officially supports 20+ languages, with unofficial support for a few more. That's global vibes right there.
- Assamese
- Bengali
- Bodo
- Dogri
- Kannada
- Malayalam
- Marathi
- Sanskrit
- Nepali
- English
- Telugu
- Hindi
- Gujarati
- Konkani
- Maithili
- Manipuri
- Odia
- Santali
- Sindhi
- Tamil
- Urdu
- Chhattisgarhi
- Kashmiri
- Punjabi
Powered by HelpingAI, where we blend emotional intelligence with tech.
Base model: ai4bharat/indic-parler-tts