Light-R1 • Collection • Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond • 7 items • Updated 5 days ago • 10
SIFT: Grounding LLM Reasoning in Contexts via Stickers • Paper • 2502.14922 • Published 26 days ago • 30
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking • Paper • 2501.04519 • Published Jan 8 • 263