arxiv:2412.17256
WeihaoZeng
AndrewZeng
Recent Activity
authored a paper 21 days ago
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
upvoted a paper 26 days ago
Search-o1: Agentic Search-Enhanced Large Reasoning Models
upvoted a paper 28 days ago
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
models 3
datasets 60
AndrewZeng/math-bstar-sample • Viewer • 11.5k • 13
AndrewZeng/bstar-math-dev • Viewer • 604 • 41
AndrewZeng/prm-reward-data • Viewer • 240k • 41
AndrewZeng/math-trn-format • Viewer • 11.5k • 66
AndrewZeng/math_scaling • Viewer • 100 • 27
AndrewZeng/random_syn • Viewer • 108k • 26
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_4 • Viewer • 38.9k • 35
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_3 • Viewer • 38.9k • 37
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_2 • Viewer • 38.9k • 33
AndrewZeng/medium_syn_mistral_20w_mistral_infer_part_1 • Viewer • 38.9k • 33