A collection of LLM-related papers by Purdue researchers. Feel free to add your own.
- Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control
  Paper • 2604.26326 • Published • 11
- Cascade Reward Sampling for Efficient Decoding-Time Alignment
  Paper • 2406.16306 • Published • 1
- DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning
  Paper • 2510.02341 • Published • 4
- More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment
  Paper • 2504.02193 • Published • 1