World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning Paper • 2503.10480 • Published about 15 hours ago • 22
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published 3 days ago • 13
Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora Paper • 2401.14624 • Published Jan 26, 2024 • 1