SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 5 days ago • 41
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published Oct 9, 2024 • 6