The results of other models are taken from the report.

| Llama 3.1 Swallow 8B Instruct v0.2| 0.9294 | 0.5601 | 0.5988 | 0.9148 | 0.1372 | 0.5280 | **0.2878** | 0.2270 | 0.5504 | 0.4079 | **0.5141** |
| Moriyasu_Qwen2_JP_7B (OURS)| **0.9321** | 0.4823 | **0.6046** | **0.9201** | 0.1382 | 0.5560 | 0.2636 | 0.1892 | 0.5273 | 0.2976 | 0.4911 |

### Japanese MTBench

For this evaluation, we use [FastChat](https://github.com/Stability-AI/FastChat/tree/jp-stable) with **gpt-4o-2024-08-06** as the judge model and for generating the reference answers.
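
Below is a minimal sketch of how such a run can be driven with FastChat's `llm_judge` scripts (`gen_model_answer.py`, `gen_judgment.py`, `show_result.py`). It is illustrative only, not the exact commands behind the numbers below: the flag names follow the upstream `llm_judge` tooling, while the bench name `japanese_mt_bench`, the model path, and the model ID are placeholders/assumptions for the jp-stable fork.

```python
# Assumed workflow sketch, not the exact commands used for these results.
# Run from fastchat/llm_judge in the jp-stable checkout; OPENAI_API_KEY must
# be set for the judge step.
import subprocess

MODEL_PATH = "Moriyasu/Moriyasu_Qwen2_JP_7B"   # placeholder HF path (assumption)
MODEL_ID = "moriyasu-qwen2-jp-7b"              # answer-file name used by the scripts
BENCH = "japanese_mt_bench"                    # bench name in the jp-stable fork (assumption)
JUDGE = "gpt-4o-2024-08-06"

def run(args):
    """Run one llm_judge script and fail loudly if it errors."""
    print(">>", " ".join(args))
    subprocess.run(args, check=True)

# 1) Generate the model's answers to the Japanese MT-Bench questions.
run(["python", "gen_model_answer.py",
     "--bench-name", BENCH,
     "--model-path", MODEL_PATH,
     "--model-id", MODEL_ID])

# 2) Have the judge model score the answers (single-answer grading).
run(["python", "gen_judgment.py",
     "--bench-name", BENCH,
     "--judge-model", JUDGE,
     "--model-list", MODEL_ID,
     "--parallel", "2"])

# 3) Print per-category scores and the overall average (JMTAvg).
run(["python", "show_result.py",
     "--bench-name", BENCH,
     "--judge-model", JUDGE,
     "--model-list", MODEL_ID])
```
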
Due to limited computational resources, we evaluated only a select set of models.

|Model|coding|extraction|humanities|math|reasoning|roleplay|stem|writing|JMTAvg|
|---|---|---|---|---|---|---|---|---|---|
| Moriyasu_Qwen2_JP_7B (OURS) | **5.15** | 7.10 | **8.45** | **6.85** | **5.85** | **8.15** | **7.10** | 7.65 | **7.04** |
| Llama-3-ELYZA-JP-8B | 3.65 | **7.20** | 7.30 | 4.00 | 5.55 | 6.70 | 5.80 | **7.85** | 6.01 |
| Llama 3.1 Swallow 8B Instruct v0.1| 4.80 | 6.80 | 7.05 | 4.75 | 4.25 | 7.10 | 6.20 | 6.45 | 5.92 |
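
As a quick arithmetic check, the reported JMTAvg values are consistent with the unweighted mean of the eight category scores (reading JMTAvg as a simple mean is an assumption based on standard MT-Bench reporting; the small gaps come from the per-category scores being rounded to two decimals).

```python
# Sanity check (assumption): JMTAvg is read as the unweighted mean of the
# eight per-category scores, as in standard MT-Bench reporting. A tolerance
# of ~0.01 absorbs two-decimal rounding of the published category scores.
rows = {
    # model: ([coding, extraction, humanities, math, reasoning, roleplay, stem, writing], reported JMTAvg)
    "Moriyasu_Qwen2_JP_7B (OURS)":        ([5.15, 7.10, 8.45, 6.85, 5.85, 8.15, 7.10, 7.65], 7.04),
    "Llama-3-ELYZA-JP-8B":                ([3.65, 7.20, 7.30, 4.00, 5.55, 6.70, 5.80, 7.85], 6.01),
    "Llama 3.1 Swallow 8B Instruct v0.1": ([4.80, 6.80, 7.05, 4.75, 4.25, 7.10, 6.20, 6.45], 5.92),
}

for model, (scores, reported) in rows.items():
    mean = sum(scores) / len(scores)
    assert abs(mean - reported) < 0.011, (model, mean, reported)
    print(f"{model}: mean of categories = {mean:.3f}, reported JMTAvg = {reported:.2f}")
```
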