Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 3 items • Updated 25 days ago • 89
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 3 items • Updated 25 days ago • 89
Running 258 258 Qwen2.5 Omni 7B Demo 🏆 Generate text and speech responses from text, images, or audio input
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based Qwen2.5 • 3 items • Updated 25 days ago • 89
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published Jan 3 • 46