ryanmarten
committed on
Update README.md
README.md
CHANGED
@@ -27,7 +27,7 @@ It outperforms Qwen-2.5-7B-Instruct on math reasoning benchmarks:
 ||Bespoke-Stratos-7B|Qwen2.5-7B-Instruct|DeepSeek-R1-Distill-Qwen-7B (Ours)|DeepSeek-R1-Distill-Qwen-7B (Reported)|
 |---|---|---|---|---|
 |AIME2024|20.0|10.0|43.3|55.5|
-|MATH500|82.0|74.2|89.4|
+|MATH500|82.0|74.2|89.4|92.8|
 |GPQA-Diamond|37.8|33.3|44.9|49.1|
 |LiveCodeBench v2 Easy|71.4|65.9|81.3|-|
 |LiveCodeBench v2 Medium|25.5|18.9|42.2|-|