Update README.md
README.md
CHANGED
@@ -95,3 +95,15 @@ while True:
+## Evaluations
+The following results were re-evaluated; each score is the average for that test.
+
+| Benchmark  | Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct-abliterated |
+|------------|-----------------------------|-----------------------------------------|
+| IF_Eval    | 43.43                       | **45.41**                               |
+| MMLU Pro   | 21.5                        | 20.57                                   |
+| TruthfulQA | 46.07                       | 41.9                                    |
+| BBH        | 36.67                       | 36.09                                   |
+| GPQA       | 28.00                       | 26.13                                   |
+
+The evaluation script is available in this repository at /eval.sh, or [here](https://huggingface.co/huihui-ai/Qwen2.5-Coder-1.5B-Instruct-abliterated/blob/main/eval.sh).
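
To make the comparison in the table easier to read, the per-benchmark differences can be computed directly. The following is a minimal sketch, not part of the commit or of eval.sh: the `scores` dictionary and the `score_deltas` helper are hypothetical names introduced here, and the values are copied from the table as reported.

```python
# Scores copied from the Evaluations table: (base model, abliterated model).
# The dictionary and helper below are illustrative, not part of eval.sh.
scores = {
    "IF_Eval": (43.43, 45.41),
    "MMLU Pro": (21.5, 20.57),
    "TruthfulQA": (46.07, 41.9),
    "BBH": (36.67, 36.09),
    "GPQA": (28.00, 26.13),
}


def score_deltas(table):
    """Return (abliterated - base) per benchmark, rounded to two decimals."""
    return {name: round(abl - base, 2) for name, (base, abl) in table.items()}


deltas = score_deltas(scores)
for name, delta in deltas.items():
    # A positive delta means the abliterated model scored higher.
    print(f"{name:<12}{delta:+.2f}")
```

Under these numbers, only IF_Eval shows a positive difference, which matches the single bolded entry in the table.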