Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated 3 days ago • 434
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think • Paper • 2502.20172 • Published Feb 27 • 28