mfajcik committed on
Commit 889e149 · verified · 1 Parent(s): dab68b3

Update content.py

Fixed articles in the description, and reformatted for clarity.

Files changed (1)
  1. content.py +10 -10
content.py CHANGED
@@ -5,18 +5,18 @@ HEADER_MARKDOWN = """
  # 🇨🇿 BenCzechMark
 
  Welcome to the leaderboard!
- Here you can compare models on tasks in Czech language and/or submit your own model. We use our modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under same protocol.
+ Here, you can compare models on tasks in the Czech language or submit your own model. We use a modified fork of [lm-evaluation-harness](https://github.com/DCGM/lm-evaluation-harness) to evaluate every model under the same protocol.
 
-
- - Head to **Submission** page to learn about submission details.
- - See **About** page for brief description of our evaluation protocol & win score mechanism, citation information, and future directions for this benchmark.
+ - Visit the **Submission** page to learn about how to submit your model.
+ - Check out the **About** page for a brief overview of our evaluation protocol, win score mechanism, citation details, and future plans for this benchmark.
  - __How scoring works__:
- - On each task, the __Duel Win Score__ reports proportion of won duels.
- - Category scores are obtained by averaging across category tasks. When selecting a category (other then Overall), the "Average" column shows Category Duel Win Scores.
- - __Overall__ Duel Win Scores are an average over category scores. When selecting Overall category, the "Average" column shows Overall Duel Win Score.
- - All public submissions are shared in [CZLC/LLM_benchmark_data](https://huggingface.co/datasets/CZLC/LLM_benchmark_data) dataset.
- - In submission page, __you can obtain results on leaderboard without publishing them__.
- - First step is "pre-submission", and after this is done (significance tests can take up to an hour), the results can be submitted if you'd like to.
+ - For each task, the __Duel Win Score__ reflects the proportion of duels a model has won.
+ - Category scores are calculated by averaging scores across all tasks within that category. When viewing a specific category (other than Overall), the "Average" column displays the Category Duel Win Scores.
+ - The __Overall__ Duel Win Score is the average across all category scores. When selecting the Overall category, the "Average" column shows the Overall Duel Win Score.
+ - All public submissions are available in the [CZLC/LLM_benchmark_data](https://huggingface.co/datasets/CZLC/LLM_benchmark_data) dataset.
+ - On the submission page, __you can view your model's results on the leaderboard without publishing them__.
+ - The first step is "pre-submission." After this is complete (significance tests may take up to an hour), you can choose to submit the results if you wish.
+
 
  """
  LEADERBOARD_TAB_TITLE_MARKDOWN = """
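
The scoring rules described in the updated text amount to two layers of averaging: per-task duel win rates are averaged into category scores, and category scores are averaged into the Overall Duel Win Score. Below is a minimal sketch of that arithmetic in Python; the function names, category names, and numbers are illustrative assumptions, not BenCzechMark's actual implementation.

```python
# Minimal sketch of the score aggregation described above.
# All names and values here are hypothetical examples, not BenCzechMark code.

def category_score(task_win_rates: list[float]) -> float:
    """Category Duel Win Score: mean of per-task duel win rates in the category."""
    return sum(task_win_rates) / len(task_win_rates)

def overall_score(category_scores: list[float]) -> float:
    """Overall Duel Win Score: mean of the category scores."""
    return sum(category_scores) / len(category_scores)

# Example: a model's duel win rates (proportion of duels won) per task,
# grouped into two hypothetical categories.
categories = {
    "reading_comprehension": [0.62, 0.55, 0.71],
    "sentiment": [0.48, 0.52],
}

cat_scores = {name: category_score(rates) for name, rates in categories.items()}
print(cat_scores)                                # shown in "Average" when a category is selected
print(overall_score(list(cat_scores.values())))  # shown in "Average" for the Overall category
```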