TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools Paper • 2503.10970 • Published 4 days ago • 12
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models Paper • 2503.11224 • Published 4 days ago • 23
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published 3 days ago • 90
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 5 days ago • 23
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 5 days ago • 54
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published 5 days ago • 22
Shifting Long-Context LLMs Research from Input to Output Paper • 2503.04723 • Published 11 days ago • 19
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published 5 days ago • 30
Quantizing Large Language Models for Code Generation: A Differentiated Replication Paper • 2503.07103 • Published 8 days ago • 6
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG Paper • 2503.04388 • Published 12 days ago • 15
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published 7 days ago • 14
Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol Paper • 2503.05860 • Published 10 days ago • 8
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published 7 days ago • 35
Implicit Reasoning in Transformers is Reasoning through Shortcuts Paper • 2503.07604 • Published 7 days ago • 18
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published 7 days ago • 30
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Paper • 2503.07536 • Published 7 days ago • 76