SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published 17 days ago • 94
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training • Paper • 2501.11425 • Published Jan 20 • 92
Large Action Models: From Inception to Implementation • Paper • 2412.10047 • Published Dec 13, 2024 • 32