cuisijia's Collections
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 31
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 51
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 51
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 79
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Paper • 2503.21614 • Published • 42
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Paper • 2503.22230 • Published • 46
Efficient Inference for Large Reasoning Models: A Survey
Paper • 2503.23077 • Published • 47
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models
Paper • 2503.24235 • Published • 55
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Paper • 2504.00294 • Published • 11
Inference-Time Scaling for Generalist Reward Modeling
Paper • 2504.02495 • Published • 57
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper • 2504.01990 • Published • 302