Empty main benchmark scores

#1007
by ymcki - opened

I am getting empty main benchmark scores in the results from my two recent submissions:
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18-merge/results_2024-11-04T02-06-46.701689.json

Previous submissions from about five days ago were fine. What's going on?

Is it possible to calculate the main benchmark scores from the sub-scores?
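
For context, here's roughly what I had in mind (a minimal sketch, assuming the results JSON keeps a top-level `results` dict keyed by `leaderboard_*` task names, and that the metric keys look like `acc_norm,none`; it only computes an unweighted average per benchmark group, not the leaderboard's exact normalization, which rescales each task against its random-guess baseline):

```python
import json
from statistics import mean
from urllib.request import urlopen

# Raw link derived from the first blob URL above (swap /blob/ for /raw/).
URL = ("https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/"
       "ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json")

data = json.load(urlopen(URL))

# Group sub-task metrics under their parent benchmark, e.g. "leaderboard_bbh_navigate"
# rolls up into "leaderboard_bbh". The prefix grouping is a crude assumption.
groups: dict[str, list[float]] = {}
for task, metrics in data["results"].items():
    parent = "_".join(task.split("_")[:2])
    # Take whichever accuracy-style metric is present; these key names are assumptions.
    for key in ("acc_norm,none", "acc,none", "exact_match,none"):
        value = metrics.get(key)
        if isinstance(value, (int, float)):
            groups.setdefault(parent, []).append(value)
            break

# Unweighted per-benchmark averages -- NOT the leaderboard's official scores,
# which first normalize each task against its random-guess baseline.
for parent, scores in sorted(groups.items()):
    print(f"{parent}: {mean(scores):.4f}")
```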

Open LLM Leaderboard org

Hi @ymcki ,

Thank you for reporting!

We've updated the Harness version for our evaluations and encountered this issue – everything is correct now, so I'll resubmit your models to get the correct results.

Thank you for your patience!

alozowski changed discussion status to closed