Gregor Betz committed
Commit 3fc8d52
1 Parent(s): 058891a
Files changed (1)
  1. src/display/about.py +26 -7
src/display/about.py CHANGED
@@ -49,16 +49,35 @@ Each `regime` has a different _accuracy gain Δ_, and the leaderboard reports (f

## How is it different from other leaderboards?

- Performance leaderboards like the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) or [YALL](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard) do a great job in ranking models according task performance.
+ Performance leaderboards like the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) or [YALL](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard) do a great job in ranking models according to task performance.

Unlike these leaderboards, the `/\/` Open CoT Leaderboard assesses a model's ability to effectively reason about a `task`:

- |🤗 Open LLM Leaderboard |`/\/` Open CoT Leaderboard |
- |---|---|
- |Can `model` solve `task`?|Can `model` do CoT to improve in `task`?|
- |Measures `task` performance.|Measures ability to reason (about `task`).|
- |Metric: absolute accuracy.|Metric: relative accuracy gain.|
- |Covers broad spectrum of `tasks`.|Focuses on critical thinking `tasks`.|
+
+ <table>
+ <tr style="text-align:center;">
+ <td>🤗 Open LLM Leaderboard </td>
+ <td>`/\/` Open CoT Leaderboard </td>
+ </tr>
+ <tr>
+ <td>Can `model` solve `task`?</td>
+ <td>Can `model` do CoT to improve in `task`?</td>
+ </tr>
+ <tr>
+ <td>Measures `task` performance.</td>
+ <td>Measures ability to reason (about `task`).</td>
+ </tr>
+ <tr>
+ <td>Metric: absolute accuracy.</td>
+ <td>Metric: relative accuracy gain.</td>
+ </tr>
+ <tr>
+ <td>Covers broad spectrum of `tasks`.</td>
+ <td>Focuses on critical thinking `tasks`.</td>
+ </tr>
+ </table>
+
+


## Test dataset selection (`tasks`)
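
The table's key contrast is the metric: absolute accuracy versus the accuracy gain Δ from chain-of-thought. As a minimal sketch of that difference, assuming Δ is simply CoT accuracy minus baseline accuracy (the function and variable names below are illustrative and not taken from the leaderboard's code):

```python
# Illustrative sketch only: names and the exact definition of Δ are
# assumptions, not the leaderboard's actual implementation.

def absolute_accuracy(scores: list[int]) -> float:
    """Absolute accuracy: share of task items answered correctly
    (the Open LLM Leaderboard-style metric)."""
    return sum(scores) / len(scores)

def accuracy_gain(base_scores: list[int], cot_scores: list[int]) -> float:
    """Accuracy gain Δ: improvement of chain-of-thought answers over
    direct answers on the same items (the Open CoT Leaderboard-style
    metric, assuming Δ is a simple difference)."""
    return absolute_accuracy(cot_scores) - absolute_accuracy(base_scores)

# Example: direct answering gets 3/5 right, CoT gets 4/5 right, so Δ = 0.2.
base = [1, 0, 1, 0, 1]   # 0/1 correctness per item without CoT
cot = [1, 1, 1, 0, 1]    # 0/1 correctness per item with CoT
print(accuracy_gain(base, cot))  # 0.2
```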