Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble Paper • 2401.16635 • Published Jan 30, 2024
Planning with Large Language Models for Code Generation Paper • 2303.05510 • Published Mar 9, 2023