Code for evaluating new models? (#620)
Opened by YannDubs
Hi, is the exact script used to run the open_llm_leaderboard open-sourced? I only found vague commands that suggest using lm_eval.
Thanks!
Hi!
You can find the precise steps to reproduce our evaluations in the About tab of the leaderboard. The Open LLM Leaderboard uses the EleutherAI lm-evaluation-harness for evaluation.
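For readers who want to script the reproduction, here is a minimal sketch of how the harness invocation could be assembled per benchmark. It is not the leaderboard's own code: the flags, the `hf-causal-experimental` model type, and the few-shot counts are written from memory of what the About tab listed and should be checked against it, along with the harness version it says to check out. The helper `build_harness_command` and the model name `my-org/my-model` are hypothetical.

```python
import shlex

# Few-shot counts per task, as recalled from the About tab (assumption: verify there).
FEWSHOT_BY_TASK = {
    "arc_challenge": 25,
    "hellaswag": 10,
    "truthfulqa_mc": 0,
}


def build_harness_command(model, task, num_fewshot, output_path, revision="main"):
    """Return the argument list for one harness run (hypothetical helper).

    Assumes the command is run from a checkout of the harness version
    named in the About tab, where `main.py` is the entry point.
    """
    model_args = f"pretrained={model},use_accelerate=True,revision={revision}"
    return [
        "python",
        "main.py",
        "--model=hf-causal-experimental",
        f"--model_args={model_args}",
        f"--tasks={task}",
        f"--num_fewshot={num_fewshot}",
        "--batch_size=1",
        f"--output_path={output_path}",
    ]


if __name__ == "__main__":
    # Print one shell-quoted command per benchmark; nothing is executed here.
    for task, shots in FEWSHOT_BY_TASK.items():
        cmd = build_harness_command(
            "my-org/my-model", task, shots, f"results/{task}.json"
        )
        print(shlex.join(cmd))
```

The script only prints the commands, so they can be compared with the About tab before anything is run.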
clefourrier changed discussion status to closed