SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper • 2502.02737 • Published 5 days ago • 140
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning • Paper • 2502.01100 • Published 6 days ago • 14
Preference Leakage: A Contamination Problem in LLM-as-a-judge • Paper • 2502.01534 • Published 6 days ago • 34
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models • Paper • 2502.02492 • Published 5 days ago • 46
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming • Paper • 2501.18837 • Published 10 days ago • 8
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training • Paper • 2501.18511 • Published 10 days ago • 17
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback • Paper • 2501.12895 • Published 18 days ago • 55
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding • Paper • 2501.13106 • Published 18 days ago • 79
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 18 days ago • 305
PaSa: An LLM Agent for Comprehensive Academic Paper Search • Paper • 2501.10120 • Published 23 days ago • 43
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 26 days ago • 273
Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 26 days ago • 53
The Lessons of Developing Process Reward Models in Mathematical Reasoning • Paper • 2501.07301 • Published 27 days ago • 89