Reinforced Token Optimization
AI & ML interests: None defined yet.
Team members: 1
Models (10)
RTO-RL/Llama3-8B-RTO_RPP • Updated Apr 10 • 10 • 1
RTO-RL/Llama3-8B-RPP • Updated Apr 10 • 27 • 1
RTO-RL/Llama3-8B-TDPO • Updated Feb 11 • 9 • 1
RTO-RL/Llama3-8B-SimPO • Updated Feb 11 • 12
RTO-RL/Llama3-8B-RDPO • Updated Feb 11 • 44 • 1
RTO-RL/Llama3-8B-PPO • Updated Feb 11 • 17 • 1
RTO-RL/Llama3-8B-RTO • Updated Feb 11 • 41 • 1
RTO-RL/Llama3.2-1B-RewardModel • Updated Feb 11 • 61
RTO-RL/Llama3-8B-RewardModel • Updated Feb 11 • 53
RTO-RL/Llama3-8B-DPO • Updated Feb 11 • 9
Datasets (0): None public yet