arxiv:2508.19652
Haitao Mi
haitaominlp
AI & ML interests
Large Language Models
Recent Activity
upvoted a collection 4 days ago
Olmo 3
upvoted a paper about 2 months ago
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning