Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 1 day ago • 42
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data Paper • 2603.09206 • Published 2 days ago • 36
Streaming Autoregressive Video Generation via Diagonal Distillation Paper • 2603.09488 • Published 1 day ago • 5
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion Paper • 2603.06577 • Published 5 days ago • 37
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising Paper • 2603.08703 • Published 2 days ago • 26
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence Paper • 2603.07660 • Published 3 days ago • 72
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 5 days ago • 96
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published 6 days ago • 25
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model Paper • 2603.05438 • Published 6 days ago • 33
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios Paper • 2602.23166 • Published 13 days ago • 39
RealWonder: Real-Time Physical Action-Conditioned Video Generation Paper • 2603.05449 • Published 6 days ago • 11
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published 8 days ago • 85
Proact-VL: A Proactive VideoLLM for Real-Time AI Companions Paper • 2603.03447 • Published 8 days ago • 31