---
license: apache-2.0
language:
- ja
base_model:
- Qwen/Qwen2-7B
pipeline_tag: text-generation
library_name: transformers
---

# Moriyasu_Qwen2_JP_7B

### Model Description

**Moriyasu_Qwen2_JP_7B** is a large language model trained by Moriyasu. Based on [Qwen/Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B), it has been enhanced for Japanese usage through additional pre-training and instruction tuning.

# Training Datasets

### Pre-training dataset

Starting from Qwen2-7B, the model is continually pre-trained on Japanese data while preserving the base model's English ability (the training mix is 80% Japanese and 20% English). We use about 120 billion tokens sampled from Japanese and English Wikipedia articles, Japanese CC-100, Japanese C4, Japanese OSCAR, The Pile, Webfined, Japanese websites, book data, mathematics, and code, among other sources.

### Instruction Tuning

We built about 1 million instruction examples using a mix of methods, including synthetically generated data, translated data, and data manually annotated by humans.

# Model Performance

### JGLUE tasks

We used the [lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable) repository to evaluate the model on 8 tasks; the results are as follows:

|Model|JCommonsenseQA|JNLI|JMARC|JSQuAD|JAQKET-V2|XL-SUM|XWINOGRAD|MGSM|JA AVG|
|---|---|---|---|---|---|---|---|---|---|
| |3-shot|3-shot|0-shot|2-shot|1-shot|1-shot|0-shot|5-shot| |
| |Acc.|Balanced Acc.|Balanced Acc.|Char-F1|Char-F1|ROUGE-2|Acc.|Acc.| |
| Moriyasu_Qwen2_JP_7B (OURS) | **94.91** | **91.11** | 95.50 | 87.48 | 89.24 | 19.66 | **82.38** | 55.60 | **76.99** |
| Qwen2-7B-Instruct | 90.80 | 78.07 | 93.29 | 92.90 | 83.34 | 19.05 | 72.16 | **61.20** | 73.85 |
| SakanaAI/EvoLLM-JP-v1-7B | 89.19 | 66.02 | 95.55 | 92.10 | 86.41 | **23.31** | 81.65 | 47.60 | 72.73 |
| Llama-3-ELYZA-JP-8B | 92.40 | 64.85 | **95.67** | 92.04 | 87.43 | 21.35 | 78.21 | 49.20 | 72.64 |
| Llama-3-Swallow-8B-Instruct-v0.1 | 92.49 | 62.12 | 94.27 | **93.73** | **90.83** | 19.61 | 74.04 | 50.00 | 72.14 |
| Tanuki-8B-dpo-v1.0 | 79.18 | 43.05 | 92.26 | 82.29 | 77.99 | 11.68 | 70.39 | 43.60 | 62.56 |
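
Since the card declares `library_name: transformers` and `pipeline_tag: text-generation`, the model can be loaded and queried as shown in the minimal sketch below. The repository id `Moriyasu/Moriyasu_Qwen2_JP_7B` and the ChatML-style chat template inherited from the Qwen2 base model are assumptions, not details confirmed by this card; adjust them to the published checkpoint.

```python
# Minimal inference sketch (assumptions: repo id and Qwen2-style chat template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Moriyasu/Moriyasu_Qwen2_JP_7B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision keeps a 7B model within a single modern GPU
    device_map="auto",
)

# Japanese instruction-style prompt, formatted with the tokenizer's chat template.
messages = [
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},  # "You are an honest and capable Japanese assistant."
    {"role": "user", "content": "日本で一番高い山は何ですか？"},  # "What is the tallest mountain in Japan?"
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The sampling parameters above are illustrative defaults, not values recommended by the model authors; greedy decoding or other settings may be preferable depending on the task.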