Chat with a Qwen AI assistant
An end-to-end (e2e) Voice Language Model by Fish Audio.
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Real-time implementation of Whisper large turbo
Generate image descriptions from text prompts
Detect objects in images and get bounding boxes
A GPT-4o-like bot.