Empty main benchmark scores

#1007
by ymcki - opened

I am getting empty main benchmark scores in the results from my two recent submissions:
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json
https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18-merge/results_2024-11-04T02-06-46.701689.json

Previous submissions from about five days ago were fine. What's going on?

Is it possible to calculate the main benchmark scores from the sub-scores?
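
For context, here's roughly what I had in mind (a minimal sketch, assuming the results JSON keeps a top-level `results` dict keyed by `leaderboard_*` task names, and that the metric keys look like `acc_norm,none`; it only computes an unweighted average per benchmark group, not the leaderboard's exact normalization, which rescales each task against its random-guess baseline):

```python
import json
from statistics import mean
from urllib.request import urlopen

# Raw link derived from the first blob URL above (swap /blob/ for /raw/).
URL = ("https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/"
       "ymcki/gemma-2-2b-ORPO-jpn-it-abliterated-18/results_2024-11-04T02-06-36.084992.json")

data = json.load(urlopen(URL))

# Group sub-task metrics under their parent benchmark, e.g. "leaderboard_bbh_navigate"
# rolls up into "leaderboard_bbh". The prefix grouping is a crude assumption.
groups: dict[str, list[float]] = {}
for task, metrics in data["results"].items():
    parent = "_".join(task.split("_")[:2])
    # Take whichever accuracy-style metric is present; these key names are assumptions.
    for key in ("acc_norm,none", "acc,none", "exact_match,none"):
        value = metrics.get(key)
        if isinstance(value, (int, float)):
            groups.setdefault(parent, []).append(value)
            break

# Unweighted per-benchmark averages -- NOT the leaderboard's official scores,
# which first normalize each task against its random-guess baseline.
for parent, scores in sorted(groups.items()):
    print(f"{parent}: {mean(scores):.4f}")
```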

Open LLM Leaderboard org

Hi @ymcki ,

Thank you for reporting!

We've updated the Harness version for our evaluations and encountered this issue – everything is correct now, so I'll resubmit your models to get the correct results.

Thank you for your patience!

alozowski changed discussion status to closed