LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis Paper • 2503.21749 • Published 22 days ago • 25
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models Paper • 2503.13939 • Published Mar 18 • 4
CLS-RL: Image Classification with Rule-Based Reinforcement Learning Paper • 2503.16188 • Published 29 days ago • 9
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published Jan 23 • 17
Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models Paper • 2410.00363 • Published Oct 1, 2024 • 1
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models Paper • 2312.06685 • Published Dec 9, 2023 • 1
Boosting Open-Domain Continual Learning via Leveraging Intra-domain Category-aware Prototype Paper • 2408.09984 • Published Aug 19, 2024 • 1
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions Paper • 2409.15278 • Published Sep 23, 2024 • 26
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining Paper • 2408.02657 • Published Aug 5, 2024 • 36
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Paper • 2402.05935 • Published Feb 8, 2024 • 17