Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 8 days ago • 98
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning Paper • 2504.07128 • Published 15 days ago • 73
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published 10 days ago • 24
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 16 days ago • 37
Effectively Controlling Reasoning Models through Thinking Intervention Paper • 2503.24370 • Published 16 days ago • 18
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 16 days ago • 52
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation Paper • 2503.19693 • Published 23 days ago • 75
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 19 days ago • 34
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 16 days ago • 61
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Paper • 2503.19855 • Published 22 days ago • 25
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published 23 days ago • 17
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published 27 days ago • 68
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 29 days ago • 117
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills Paper • 2503.12533 • Published Mar 16 • 63
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published about 1 month ago • 27