Update README.md
README.md
CHANGED
@@ -37,7 +37,7 @@ All evaluations are conducted in a zero-shot setting.
 |  - Other | **41.18** | 39.06 |
 |  - Social-Science | **44.16** | 41.98 |
 | **[MMLU-Redux](https://github.com/yuchenlin/ZeroEval)** | **57.24**| 56.91 |
-| **[GSM8K](https://github.com/yuchenlin/ZeroEval)** | **
+| **[GSM8K](https://github.com/yuchenlin/ZeroEval)** | **67.25**| 57.16 |
 | **[MATH-L5](https://github.com/yuchenlin/ZeroEval)** | **19.97**| 16.23 |
 | **[CRUX](https://github.com/yuchenlin/ZeroEval)** | **31.25**| 25.25 |
 | **[AlpacaEval](https://github.com/tatsu-lab/alpaca_eval)** | **23.87**| 19.35 |