Commit f31412c (1 parent: 2064815), committed by ArkaAbacus
Update README.md

README.md CHANGED
@@ -27,6 +27,14 @@ We ran MT-Bench with the Qwen conversation template.
 | Qwen1.5-72B-Chat | 8.59 | 8.08 | 8.34 |
 | Smaug-2-72B | 8.86 | 8.20 | 8.53
 
+#### HumanEval
+
+We ran HumanEval with pass@1 with the Qwen conversation template. Smaug-2 outperforms Qwen1.5-72B-Chat by approximately 10%:
+
+| Model | pass@1 (%) |
+| ------| ---------- |
+| Qwen1.5-72B-Chat | 56.7 |
+| Smaug-2-72B | 66.5 |
+
 ## Model Details
 
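
Note on the "approximately 10%" figure in the added HumanEval text: it appears to describe the absolute gap in percentage points rather than a relative improvement. The following is a minimal sketch of that arithmetic, using only the two pass@1 values from the table in this diff. The `gap` helper is a hypothetical name introduced for illustration and is not part of the repository.

```python
# pass@1 scores (%) copied from the HumanEval table added in this commit
scores = {"Qwen1.5-72B-Chat": 56.7, "Smaug-2-72B": 66.5}


def gap(baseline: float, candidate: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain in %).

    Hypothetical helper for illustration only.
    """
    absolute = candidate - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


absolute, relative = gap(scores["Qwen1.5-72B-Chat"], scores["Smaug-2-72B"])
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
# prints "absolute: 9.8 points, relative: 17.3%"
```

Read as percentage points, the README's "approximately 10%" matches the 9.8-point gap. Read as a relative gain over the Qwen1.5-72B-Chat score, the same numbers give roughly 17%.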