Spaces:
Running
Running
Update content.py
Browse files- content.py +10 -8
content.py
CHANGED
@@ -19,8 +19,9 @@ Here, you can compare models on tasks in the Czech language or submit your own m
|
|
19 |
- On the submission page, __you can view your model's results on the leaderboard without publishing them__.
|
20 |
- The first step is "pre-submission." After this is complete (significance tests may take up to 2 hours), you can choose to submit the results if you wish.
|
21 |
- NEWS:
|
22 |
-
-
|
23 |
- 7.11.2024: We acknowledge that one of the Qwen2.5 models correctly predicted our (& Bigbench's) canary string. This confirms the contamination, it was trained on benchmark data. Other [studies](https://arxiv.org/pdf/2409.01790) also suggest the contamination issues of the Qwen family.
|
|
|
24 |
|
25 |
"""
|
26 |
LEADERBOARD_TAB_TITLE_MARKDOWN = """
|
@@ -131,12 +132,14 @@ The models submitted to leaderboard by the authors were evaluated in following s
|
|
131 |
## Citation
|
132 |
You can use the following citation for this leaderboard and our upcoming work.
|
133 |
```bibtex
|
134 |
-
@
|
135 |
-
|
136 |
-
|
137 |
-
|
138 |
-
|
139 |
-
|
|
|
|
|
140 |
}
|
141 |
```
|
142 |
|
@@ -159,7 +162,6 @@ You can use the following citation for this leaderboard and our upcoming work.
|
|
159 |
- Adam Jirkovský
|
160 |
- David Adamczyk
|
161 |
- Jan Hůla
|
162 |
-
- Jan Šedivý
|
163 |
- **Hugging Face**
|
164 |
- Hynek Kydlíček
|
165 |
|
|
|
19 |
- On the submission page, __you can view your model's results on the leaderboard without publishing them__.
|
20 |
- The first step is "pre-submission." After this is complete (significance tests may take up to 2 hours), you can choose to submit the results if you wish.
|
21 |
- NEWS:
|
22 |
+
- 23.12.2024: We released [a preprint](http://arxiv.org/abs/2412.17933) detailing our work.
|
23 |
- 7.11.2024: We acknowledge that one of the Qwen2.5 models correctly predicted our (& Bigbench's) canary string. This confirms the contamination, it was trained on benchmark data. Other [studies](https://arxiv.org/pdf/2409.01790) also suggest the contamination issues of the Qwen family.
|
24 |
+
- 1.10.2024: Find out more about 🇨🇿 BenCzechMark in our [Huggingface blogpost](https://huggingface.co/blog/benczechmark)!
|
25 |
|
26 |
"""
|
27 |
LEADERBOARD_TAB_TITLE_MARKDOWN = """
|
|
|
132 |
## Citation
|
133 |
You can use the following citation for this leaderboard and our upcoming work.
|
134 |
```bibtex
|
135 |
+
@misc{benczechmark,
|
136 |
+
title={BenCzechMark : A Czech-centric Multitask and Multimetric Benchmark for Large Language Models with Duel Scoring Mechanism},
|
137 |
+
author={Martin Fajcik and Martin Docekal and Jan Dolezal and Karel Ondrej and Karel Beneš and Jan Kapsa and Pavel Smrz and Alexander Polok and Michal Hradis and Zuzana Neverilova and Ales Horak and Radoslav Sabol and Michal Stefanik and Adam Jirkovsky and David Adamczyk and Petr Hyner and Jan Hula and Hynek Kydlicek},
|
138 |
+
year={2024},
|
139 |
+
eprint={2412.17933},
|
140 |
+
archivePrefix={arXiv},
|
141 |
+
primaryClass={cs.CL},
|
142 |
+
url={https://arxiv.org/abs/2412.17933},
|
143 |
}
|
144 |
```
|
145 |
|
|
|
162 |
- Adam Jirkovský
|
163 |
- David Adamczyk
|
164 |
- Jan Hůla
|
|
|
165 |
- **Hugging Face**
|
166 |
- Hynek Kydlíček
|
167 |
|