rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 263
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published Jan 8 • 50
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization Paper • 2501.03271 • Published Jan 5 • 11
view article Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 100
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7, 2024 • 48
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27, 2024 • 88
Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals Paper • 2308.02510 • Published Jul 27, 2023 • 22
The Hydra Effect: Emergent Self-repair in Language Model Computations Paper • 2307.15771 • Published Jul 28, 2023 • 19