The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 21 days ago • 89
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 25 days ago • 86
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 26 days ago • 252
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though Paper • 2501.04682 • Published 26 days ago • 90
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 48
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published Dec 19, 2024 • 73
What indeed can GPT models do in chemistry? A comprehensive benchmark on eight tasks Paper • 2305.18365 • Published May 27, 2023 • 4
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 42
Power-LM Collection Dense & MoE LLMs trained with power learning rate scheduler. • 4 items • Updated Oct 17, 2024 • 15
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 64
MURI: High-Quality Instruction Tuning Datasets for Low-Resource Languages via Reverse Instructions Paper • 2409.12958 • Published Sep 19, 2024 • 8
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 61
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Paper • 2409.12903 • Published Sep 19, 2024 • 22
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 136
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler Paper • 2408.13359 • Published Aug 23, 2024 • 24
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations Paper • 2403.09704 • Published Mar 8, 2024 • 31