arxiv:2405.11143
Jian Hu
chuyi777
AI & ML interests
Reinforcement Learning
Recent Activity
upvoted a paper 15 days ago
ProcessBench: Identifying Process Errors in Mathematical Reasoning
updated a model 25 days ago
OpenRLHF/Llama-3-8b-rm-mixture
updated a model 25 days ago
OpenRLHF/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt
Organizations
Papers: 1
models: None public yet
datasets: None public yet