AIJapanese
committed on
Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ We used the [lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluatio
 |---|---|---|---|---|---|---|---|---|---|
 | |3-shot|3-shot|0-shot|2-shot|1-shot|1-shot|0-shot|5-shot| |
 | |Acc.|Balanced Acc.|Balanced Acc.|Char-F1|Char-F1|ROUGE-2|Acc.|Acc.| |
-| Moriyasu_Qwen2_JP_7B (
+| Moriyasu_Qwen2_JP_7B (ours) | **0.9491** | **0.9111** | 0.9550 | 0.8748 | 0.8924 | 0.1966 | **0.8238** | 0.5560 | **0.7699** |
 | Qwen2-7B-Instruct | 0.9080 | 0.7807 | 0.9329 | 0.9290 | 0.8334 | 0.1905 | 0.7216 | **0.6120** | 0.7385 |
 | SakanaAI/EvoLLM-JP-v1-7B | 0.8919 | 0.6602 | 0.9555 | 0.9210 | 0.8641 | **0.2331** | 0.8165 | 0.4760 | 0.7273 |
 | Llama-3-ELYZA-JP-8B |0.9240 | 0.6485 | **0.9567** | 0.9204 | 0.8743 | 0.2135 | 0.7821 | 0.4920 | 0.7264 |
@@ -42,7 +42,7 @@ The results of other models are taken from the report
 |---|---|---|---|---|---|---|---|---|---|---|---|
 | |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
 | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
-| Moriyasu_Qwen2_JP_7B (
+| Moriyasu_Qwen2_JP_7B (ours)| **0.9321** | 0.4823 | **0.6046** | **0.9201** | 0.1382 | 0.5560 | 0.2636 | 0.1892 | 0.5273 | 0.2976 | 0.4911 |
 | RakutenAI-7B-chat | 0.9035 | 0.2600 | 0.4619 | 0.8647 | 0.1339 | 0.2120 | 0.2667 | 0.1966 | 0.4504 | 0.2299 | 0.3980 |
 | Qwen2-7B-Instruct | 0.8856 | 0.3902 | 0.3859 | 0.8967 | 0.1277 | 0.5720 | 0.2041 | 0.1909 | 0.5713 | **0.5683** | 0.4793 |
 | Qwen2.5-7B-Instruct | 0.9151 | 0.4293 | 0.3910 | 0.8908 | 0.1676 | **0.6240** | 0.2108 | 0.1916 | **0.6252** | 0.5305 | 0.4976 |
@@ -64,7 +64,7 @@ Due to limited computational resources, we conducted evaluations on only a selec
 
 |Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
 |---|---|---|---|---|---|---|---|---|---|
-| Moriyasu_Qwen2_JP_7B (
+| Moriyasu_Qwen2_JP_7B (ours) | **0.515** | 0.710 | **0.845** | **0.685** | **0.585** | **0.815** | **0.710** | **0.765** | **0.704** |
 | Llama-3-ELYZA-JP-8B | 0.365 | **0.72** | 0.730 | 0.400 | 0.555 | 0.670 | 0.580 | 0.785 | 0.601 |
 | Llama 3.1 Swallow 8B Instruct v0.1| 0.480 | 0.680 | 0.705 | 0.475 | 0.425 | 0.710 | 0.620 | 0.645 | 0.592 |
 
@@ -74,7 +74,7 @@ For this benchmark, we use [Elyza task 100](https://huggingface.co/datasets/ely
 
 |Model|Score|
 |---|---|
-| Moriyasu_Qwen2_JP_7B (
+| Moriyasu_Qwen2_JP_7B (ours) | 3.37 |
 | Llama-3-ELYZA-JP-8B | **3.66** |
 | Llama 3.1 Swallow 8B Instruct v0.1| 3.32 |
 
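The diff does not say how the `JMTAvg` column is derived. The numbers are consistent with a plain arithmetic mean of the eight category scores, and the sketch below checks that reading against the three rows shown in the third hunk. The helper names `mean_score` and `check_rows` and the tolerance of one unit in the last reported digit are assumptions made for illustration; they are not part of the README or of any evaluation harness.

```python
# Category scores (coding, extraction, humanities, math, reasoning, roleplay,
# stem, writing) copied from the diff, paired with the JMTAvg reported in the
# same row.
ROWS = {
    "Moriyasu_Qwen2_JP_7B (ours)": (
        [0.515, 0.710, 0.845, 0.685, 0.585, 0.815, 0.710, 0.765], 0.704),
    "Llama-3-ELYZA-JP-8B": (
        [0.365, 0.72, 0.730, 0.400, 0.555, 0.670, 0.580, 0.785], 0.601),
    "Llama 3.1 Swallow 8B Instruct v0.1": (
        [0.480, 0.680, 0.705, 0.475, 0.425, 0.710, 0.620, 0.645], 0.592),
}


def mean_score(scores):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(scores) / len(scores)


def check_rows(rows, tol=1e-3):
    """Map each model name to True if its reported average is within `tol`
    of the mean of its category scores (assumed relationship)."""
    return {
        name: abs(mean_score(scores) - reported) < tol
        for name, (scores, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_rows(ROWS).items():
        print(name, "consistent" if ok else "mismatch")
```

The check compares within a tolerance instead of rounding, because the reported averages may have been rounded or truncated differently from Python's `round`.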