Collections
Discover the best community collections!
Collections trending this week
- RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment
  Paper • 2307.12950 • Published • 10
- HumanLLMs/Human-Like-DPO-Dataset
  Viewer • Updated • 10.9k • 3.02k • 192
- sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo
  Viewer • Updated • 5.65k • 158 • 21
- RLHFlow/Deepseek-PRM-Data
  Viewer • Updated • 253k • 128 • 12