Jiarui Yao
FlippyDora
0 followers · 9 following
AI & ML interests: None yet
FlippyDora's activity
updated a model about 9 hours ago
ScaleML-RLHF/Qwen2.5-Math-7B-raftpp-em-n8-8-iter10 • Updated about 9 hours ago • 4
published a model about 9 hours ago
ScaleML-RLHF/Qwen2.5-Math-7B-raftpp-em-n8-8-iter10 • Updated about 9 hours ago • 4
upvoted 2 papers about 10 hours ago
OTC: Optimal Tool Calls via Reinforcement Learning • Paper • 2504.14870 • Published 1 day ago • 18
ToolRL: Reward is All Tool Learning Needs • Paper • 2504.13958 • Published 6 days ago • 21
updated a model about 14 hours ago
ScaleML-RLHF/Qwen2.5-Math-7B-raftpp-em-n8-8-iter9 • Updated about 14 hours ago • 5
published a model about 14 hours ago
ScaleML-RLHF/Qwen2.5-Math-7B-raftpp-em-n8-8-iter9 • Updated about 14 hours ago • 5
updated 3 models about 22 hours ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter15 • Updated about 22 hours ago • 5
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter14 • Updated about 22 hours ago • 5
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter13 • Updated about 22 hours ago • 5
published 3 models about 22 hours ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter15 • Updated about 22 hours ago • 5
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter14 • Updated about 22 hours ago • 5
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter13 • Updated about 22 hours ago • 5
updated a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter12 • Updated 1 day ago • 6
published a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter12 • Updated 1 day ago • 6
updated a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter11 • Updated 1 day ago • 8
published a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter11 • Updated 1 day ago • 8
updated a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter10 • Updated 1 day ago • 5
published a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter10 • Updated 1 day ago • 5
updated a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter9 • Updated 1 day ago • 7
published a model 1 day ago
ScaleML-RLHF/Qwen2.5-Math-1.5B-grpo-em-n8-8-iter9 • Updated 1 day ago • 7