gaia-benchmark/results_public
Benchmarking General AI Agents
"Prompt question?" to "Question: prompt question?\nChoices: enumeration of all choices\nAnswer:"), we get a score range of...
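To make the two prompt formats contrasted above concrete, here is a minimal Python sketch that builds both from the same question. The function names `bare_prompt` and `structured_prompt` are hypothetical, and rendering the "enumeration of all choices" as lettered items (`A.`, `B.`, ...) separated by spaces is an assumption; the source does not say how the choices are enumerated.

```python
from string import ascii_uppercase


def bare_prompt(question: str) -> str:
    """First format: the question text on its own, with no framing."""
    return question


def structured_prompt(question: str, choices: list[str]) -> str:
    """Second format: 'Question: ...\\nChoices: ...\\nAnswer:'.

    Assumption: choices are rendered as 'A. x B. y' on a single line.
    The source only says 'enumeration of all choices'.
    """
    enumerated = " ".join(
        f"{ascii_uppercase[i]}. {choice}" for i, choice in enumerate(choices)
    )
    return f"Question: {question}\nChoices: {enumerated}\nAnswer:"


if __name__ == "__main__":
    question = "Capital of France?"
    choices = ["Paris", "Rome"]
    print(bare_prompt(question))
    print("---")
    print(structured_prompt(question, choices))
```

Both functions receive the same underlying question, so any difference in model score between the two outputs would come from the prompt framing alone.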