SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 5 days ago • 79 • 5
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models Paper • 2408.15518 • Published Aug 28, 2024 • 43