Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Paper • 2503.24376 • Published 7 days ago • 35
Effectively Controlling Reasoning Models through Thinking Intervention Paper • 2503.24370 • Published 7 days ago • 18
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 8 days ago • 49
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation Paper • 2503.19693 • Published 14 days ago • 75
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 10 days ago • 34
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 7 days ago • 59
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Paper • 2503.19855 • Published 13 days ago • 25
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published 14 days ago • 17
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published 18 days ago • 67
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 20 days ago • 115
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills Paper • 2503.12533 • Published 23 days ago • 63
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization Paper • 2503.12937 • Published 22 days ago • 27
Self-Taught Self-Correction for Small Language Models Paper • 2503.08681 • Published 27 days ago • 13
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 27 days ago • 27
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published 25 days ago • 48
Autoregressive Image Generation with Randomized Parallel Decoding Paper • 2503.10568 • Published 25 days ago • 8
Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Article • 27 days ago • 376
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published Mar 6 • 69
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 92