PRM and fine-tuned LLM used in our PURE GitHub repo: https://github.com/CJReinforce/PURE
Jie Cheng
jinachris
AI & ML interests
Reinforcement learning, LLM
Recent Activity
upvoted a paper about 11 hours ago: STEP3-VL-10B Technical Report
upvoted a paper 4 days ago: PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
upvoted a collection about 1 month ago: Nemotron-Post-Training-v3
Organizations
None yet