Spaces: Bowieee/StructEval_leaderboard

Bowieee committed on Aug 6

Commit 81630e4 • 1 Parent(s): 7fb003e

update content

Files changed (1)

text_content.py CHANGED

@@ -37,5 +37,5 @@ Inspired from the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/Hugg
 NOTES_TEXT = """
 * On most models on base MMLU, we collected the results for their official technical report. For the models that have not been reported, we use opencompass for evaluation.
-* For other 2 base benchmarks and all 3 structured benchmarks: for chat models, we evaluate them under 0-shot setting; for completion model, we evaluate them under 0-shot setting with ppl.
+* For other 2 base benchmarks and all 3 structured benchmarks: for chat models, we evaluate them under 0-shot setting; for completion model, we evaluate them under 0-shot setting with ppl. And we keep the prompt format consistent across all benchmarks.
 """