Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning • Paper 2511.16043 • Published Nov 20 • 108
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • Paper 2511.08577 • Published Nov 11 • 105
When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents? • Paper 2510.17862 • Published Oct 15 • 6
Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs? • Paper 2510.01161 • Published Oct 1 • 13