Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models Paper • 2503.22165 • Published 9 days ago • 23
Interpreting Emergent Planning in Model-Free Reinforcement Learning Paper • 2504.01871 • Published 3 days ago • 10
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL Paper • 2503.23157 • Published 7 days ago • 5
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published 10 days ago • 30
Efficient Model Selection for Time Series Forecasting via LLMs Paper • 2504.02119 • Published 3 days ago • 10
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published 5 days ago • 121
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents Paper • 2504.00906 • Published 4 days ago • 18
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published 6 days ago • 16
From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub Article • Feb 12 • 55
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published 20 days ago • 27
Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs Paper • 2503.12303 • Published 21 days ago • 7
SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks Paper • 2503.15478 • Published 17 days ago • 9
STEVE: A Step Verification Pipeline for Computer-use Agent Training Paper • 2503.12532 • Published 20 days ago • 14
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation Paper • 2503.13288 • Published 19 days ago • 48
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 16 days ago • 46