Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap Paper • 2402.19450 • Published Feb 29, 2024 • 3