AIJapanese
committed on
Update README.md
README.md
CHANGED
@@ -42,6 +42,7 @@ The results of other models are taken from the report
 |---|---|---|---|---|---|---|---|---|---|---|---|
 | |4-shot|4-shot|4-shot|4-shot|1-shot|4-shot|4-shot|4-shot|5-shot|0-shot| |
 | |EM acc|Char-F1|Char-F1|Char-F1|ROUGE-2|EM acc|BLEU|BLEU|EM acc|pass@1| |
+| Moriyasu_Qwen2_JP_7B (OURS)| **0.9321** | 0.4823 | **0.6046** | **0.9201** | 0.1382 | 0.5560 | 0.2636 | 0.1892 | 0.5273 | 0.2976 | 0.4911 |
 | RakutenAI-7B-chat | 0.9035 | 0.2600 | 0.4619 | 0.8647 | 0.1339 | 0.2120 | 0.2667 | 0.1966 | 0.4504 | 0.2299 | 0.3980 |
 | Qwen2-7B-Instruct | 0.8856 | 0.3902 | 0.3859 | 0.8967 | 0.1277 | 0.5720 | 0.2041 | 0.1909 | 0.5713 | **0.5683** | 0.4793 |
 | Qwen2.5-7B-Instruct | 0.9151 | 0.4293 | 0.3910 | 0.8908 | 0.1676 | **0.6240** | 0.2108 | 0.1916 | **0.6252** | 0.5305 | 0.4976 |
@@ -54,7 +55,6 @@ The results of other models are taken from the report
 | Llama 3 Swallow 8B Instruct | 0.9178 | 0.4963 | 0.5168 | 0.9088 | 0.1296 | 0.4880 | 0.2522 | 0.2254 | 0.4835 | 0.3927 | 0.4811 |
 | Llama 3.1 Swallow 8B Instruct v0.1| 0.9240 | **0.5874** | 0.5736 | 0.9170 | 0.1380 | 0.5080 | 0.2820 | **0.2282** | 0.5301 | 0.3665 | 0.5055 |
 | Llama 3.1 Swallow 8B Instruct v0.2| 0.9294 | 0.5601 | 0.5988 | 0.9148 | 0.1372 | 0.5280 | **0.2878** | 0.2270 | 0.5504 | 0.4079 | **0.5141** |
-| Moriyasu_Qwen2_JP_7B (OURS)| **0.9321** | 0.4823 | **0.6046** | **0.9201** | 0.1382 | 0.5560 | 0.2636 | 0.1892 | 0.5273 | 0.2976 | 0.4911 |
 
 ### Japanese MTBench
 
@@ -64,9 +64,9 @@ Due to limited computational resources, we conducted evaluations on only a selec
 
 |Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
 |---|---|---|---|---|---|---|---|---|---|
-| Moriyasu_Qwen2_JP_7B (OURS) | **
-| Llama-3-ELYZA-JP-8B |
-| Llama 3.1 Swallow 8B Instruct v0.1|
+| Moriyasu_Qwen2_JP_7B (OURS) | **0.515** | 0.710 | **0.845** | **0.685** | **0.585** | **0.815** | **0.710** | **0.765** | **0.704** |
+| Llama-3-ELYZA-JP-8B | 0.365 | **0.72** | 0.730 | 0.400 | 0.555 | 0.670 | 0.580 | 0.785 | 0.601 |
+| Llama 3.1 Swallow 8B Instruct v0.1| 0.480 | 0.680 | 0.705 | 0.475 | 0.425 | 0.710 | 0.620 | 0.645 | 0.592 |
 
 ### Elyza task 100:
 