THU-KEG/LongTraceRL-30B
Reinforcement Learning • 31B • 48 • 1
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards