LLM-Reward - a Trangle Collection

LLM-Reward

updated Jun 7

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms

Paper • 2406.02900 • Published Jun 5
