U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs Paper • 2412.03205 • Published Dec 4, 2024
Beemo: Benchmark of Expert-edited Machine-generated Outputs Paper • 2411.04032 • Published Nov 6, 2024
U-MATH and μ-MATH - University-level math evaluation Collection • Paper: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs • 3 items • Updated Dec 12, 2024