An end-to-end (e2e) voice language model by Fish Audio.
Generate speech from text
Gradio demo of CharacterGen (SIGGRAPH 2024)
Upload images to try on clothes virtually
Segment objects in images and videos using text prompts
Explore and vote on 3D arenas in a leaderboard
A demo of MetaVoice 1B, a text-to-speech (TTS) model by MetaVoice.