rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 4 days ago • 178
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 8 days ago • 71
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 5 days ago • 64
Stream of Search (SoS): Learning to Search in Language Paper • 2404.03683 • Published Apr 1, 2024 • 29
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published Oct 24, 2024 • 40
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 53
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 225
view article Article Memory-efficient Diffusion Transformers with Quanto and Diffusers Jul 30, 2024 • 61