trl-lib's Collections
Stepwise supervision datasets
Updated 17 days ago
trl-lib/math_shepherd • Updated 17 days ago • 445k rows • 584 downloads • 2 likes
trl-lib/prm800k • Updated 17 days ago • 41.2k rows • 144 downloads • 1 like