EvoClaw: Evaluating AI Agents on Continuous Software Evolution Paper • 2603.13428 • Published 4 days ago • 2
WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics Paper • 2603.13391 • Published 6 days ago • 10
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning Paper • 2603.15611 • Published about 14 hours ago • 1
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published about 14 hours ago • 70
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement Paper • 2603.12310 • Published 5 days ago • 7
Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents Paper • 2603.12634 • Published 4 days ago • 6
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space Paper • 2603.12648 • Published 4 days ago • 10
daVinci-Env: Open SWE Environment Synthesis at Scale Paper • 2603.13023 • Published 4 days ago • 22 • 3
Meta-Reinforcement Learning with Self-Reflection for Agentic Search Paper • 2603.11327 • Published 5 days ago • 7
TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size Paper • 2603.07988 • Published 8 days ago • 2
Mobile-GS: Real-time Gaussian Splatting for Mobile Devices Paper • 2603.11531 • Published 5 days ago • 9