LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning? Paper • 2503.19990 • Published 2 days ago • 24
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints Paper • 2503.16408 • Published 7 days ago • 36
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published 9 days ago • 42
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published 9 days ago • 42
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published 9 days ago • 42 • 2
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published 17 days ago • 54
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 148
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 30 days ago • 70
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 30 days ago • 70
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement Paper • 2501.12273 • Published Jan 21 • 14
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 105