Submitted by nebulae09 25 Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM · 12 authors 1
Submitted by carboncoo 17 DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding · 8 authors 1
Submitted by ZhaoyangLyu 12 Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation · 12 authors 1
Submitted by akhaliq 7 DAPO: An Open-Source LLM Reinforcement Learning System at Scale · 35 authors 1
Submitted by Lingaaaaaaa 4 Temporal Consistency for LLM Reasoning Process Error Identification · 7 authors 1
Submitted by kpzhang996 4 MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification · 9 authors 1
Submitted by cckevinn 4 CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era · 10 authors 1
Submitted by akhaliq 2 Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control · 39 authors 1
Submitted by kpzhang996 2 PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models · 11 authors 1
Submitted by jacklishufan 1 Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection · 7 authors 1
Submitted by Mingtongz 1 KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation · 3 authors 1
Submitted by yuwendu 1 RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation · 9 authors 1