Methods that align LLMs with human preferences
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 24
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 47
- A General Theoretical Paradigm to Understand Learning from Human Preferences
  Paper • 2310.12036 • Published • 13
- Deep Reinforcement Learning from Hierarchical Weak Preference Feedback
  Paper • 2309.02632 • Published • 1