Update space
app.py
CHANGED
@@ -141,7 +141,7 @@ with demo:
 'for large language model (LLM) evaluation across diverse, fine-grained dimensions, '
 'such as mathematics (algebra, geometry, probability), logical reasoning, social reasoning, science (chemistry, physics, biology), or any user-defined dimensions. '
 'The evaluation is decentralized and democratic, with all participating LLMs assessing each other to ensure unbiased and fair results. '
-'With a 95
+'With a 95% correlation to Chatbot Arena\'s overall rankings, the system is fully transparent and reproducible.'
 '</p>'
 f'<p style="font-size:{INTRODUCTION_TEXT_FONT_SIZE}px;">'
 'We actively invite <b>model developers</b> to participate and expedite their benchmarking efforts '