|
--- |
|
license: gemma |
|
datasets: |
|
- Magpie-Align/Magpie-Pro-MT-300K-v0.1 |
|
- Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered |
|
- iknow-lab/qarv-instruct-ko-mt-deduped |
|
- jojo0217/korean_safe_conversation |
|
- heegyu/HRC |
|
- heegyu/orca-math-korean-preference-cleaned |
|
- iknow-lab/ko-evol-writing-wiki |
|
- CarrotAI/ko-instruction-dataset |
|
- maywell/kiqu_samples |
|
- HAERAE-HUB/K2-Feedback |
|
language: |
|
- ko |
|
- en |
|
- zh |
|
--- |
|
|
|
<img src="mandoo.webp" /> |
|
|
|
Mandoo is an LM assistant supporting English, Chinese, and Korean.
|
|
|
### Example |
|
```python
from transformers import pipeline

# device_map="auto" and torch_dtype="auto" pick the best available device and precision.
pipe = pipeline("text-generation", model="heegyu/mandoo-9b-2407", device_map="auto", torch_dtype="auto")

messages = [
    {"role": "user", "content": "I want to start saving some money by growing my own food. Can I do this during the winter with an indoor garden?"},
]
# The pipeline applies the model's chat template to the messages automatically.
print(pipe(messages, max_new_tokens=128, do_sample=True))
```
|
|
|
# Benchmark Results

All generations from this model were sampled with temperature=0.7, top_p=0.9, and top_k=50.
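These settings correspond to temperature scaling followed by top-k and nucleus (top-p) filtering before sampling. A minimal stdlib sketch of that filtering step (illustrative only; the actual decoding runs inside `transformers` on tensors):

```python
import math

def filter_logits(logits, temperature=0.7, top_k=50, top_p=0.9):
    """Apply temperature, top-k, then nucleus (top-p) filtering.

    Returns a {token_index: probability} dict over the kept tokens.
    Toy illustration of the sampling settings used for the benchmarks.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    ranked = sorted(((p / total, i) for i, p in enumerate(exps)), reverse=True)
    # Top-k: keep at most the k most probable tokens.
    ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches p.
    kept, cum = [], 0.0
    for p, i in ranked:
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize over the kept tokens.
    z = sum(p for _, p in kept)
    return {i: p / z for i, p in kept}

dist = filter_logits([2.0, 1.0, 0.1, -1.0], temperature=0.7, top_k=50, top_p=0.9)
```

In practice you would simply pass `temperature=0.7, top_p=0.9, top_k=50` to the pipeline call alongside `do_sample=True`.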
|
|
|
## Korean |
|
| Model | Overall Score |
|
|---|---| |
|
| gemma-2-9b-it | 7.45 | |
|
| **mandoo-9b-2407-sft** | 6.50 | |
|
|
|
I used sampling with temperature=0.7 and max_new_tokens=2048 for generation.
|
|
|
|
|
```
# mandoo-9b-2407-sft
Category: Reasoning,     single-turn avg: 6.86, multi-turn avg: 3.86
Category: Math,          single-turn avg: 5.14, multi-turn avg: 3.71
Category: Writing,       single-turn avg: 7.29, multi-turn avg: 7.00
Category: Coding,        single-turn avg: 8.29, multi-turn avg: 8.14
Category: Understanding, single-turn avg: 9.29, multi-turn avg: 8.57
Category: Grammar,       single-turn avg: 6.43, multi-turn avg: 3.43
Overall single-turn avg: 7.21
Overall multi-turn avg: 5.79
Overall score: 6.50

# gemma-2-9b-it
Category: Reasoning,     single-turn avg: 9.43, multi-turn avg: 6.71
Category: Math,          single-turn avg: 6.14, multi-turn avg: 8.57
Category: Writing,       single-turn avg: 8.71, multi-turn avg: 8.86
Category: Coding,        single-turn avg: 7.43, multi-turn avg: 6.86
Category: Understanding, single-turn avg: 8.29, multi-turn avg: 8.29
Category: Grammar,       single-turn avg: 6.29, multi-turn avg: 3.86
Overall single-turn avg: 7.71
Overall multi-turn avg: 7.19
Overall score: 7.45
```
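As a sanity check, the reported overall figures are consistent with plain means of the per-category scores, with the overall score being the mean of the single-turn and multi-turn averages (values re-typed from the log above; small discrepancies come from upstream rounding):

```python
# Per-category scores for mandoo-9b-2407-sft, in log order:
# Reasoning, Math, Writing, Coding, Understanding, Grammar.
mandoo_single = [6.86, 5.14, 7.29, 8.29, 9.29, 6.43]
mandoo_multi = [3.86, 3.71, 7.00, 8.14, 8.57, 3.43]

single_avg = sum(mandoo_single) / len(mandoo_single)  # reported: 7.21
multi_avg = sum(mandoo_multi) / len(mandoo_multi)     # reported: 5.79
overall = (single_avg + multi_avg) / 2                # reported: 6.50
```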
|
|
|
## English |
|
### AlpacaEval |
|
``` |
|
length_controlled_winrate win_rate standard_error n_total avg_length |
|
gpt-4o-2024-05-13 57.46 51.33 1.47 805 1873 |
|
gpt-4-turbo-2024-04-09 55.02 46.12 1.47 805 1802 |
|
gpt4_1106_preview 50.00 50.00 0.00 805 2049 |
|
claude-3-opus-20240229 40.51 29.11 1.39 805 1388 |
|
claude-3-sonnet-20240229 34.87 25.56 1.34 805 1420 |
|
Meta-Llama-3-70B-Instruct 34.42 33.18 1.39 805 1919 |
|
gemini-pro 24.38 18.18 1.16 805 1456 |
|
Mixtral-8x7B-Instruct-v0.1 23.69 18.26 1.19 805 1465 |
|
Meta-Llama-3-8B-Instruct 22.92 22.57 1.26 805 1899 |
|
heegyu/mandoo-9b-2407-sft <---      19.82   18.18   1.13    805 1847
|
Mistral-7B-Instruct-v0.2 17.11 14.72 1.08 805 1676 |
|
alpaca-7b 5.88 2.59 0.49 805 396 |
|
``` |
|
|
|
### IFEval |
|
| Model | Strict Accuracy (avg) |
|
|---|---| |
|
| gemma-2-9b-it | 76.95 | |
|
| **mandoo-9b-2407-sft** | 59.19 | |
|
|
|
``` |
|
Strict Accuracy Scores: Avg 0.59191279139 |
|
prompt-level: 0.5471349353049908 |
|
instruction-level: 0.6366906474820144 |
|
|
|
Loose Accuracy Scores: |
|
prompt-level: 0.589648798521257 |
|
instruction-level: 0.6774580335731415 |
|
``` |
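IFEval reports two granularities: prompt-level accuracy counts a prompt as correct only if every instruction in it is followed, while instruction-level accuracy averages over individual instructions. A minimal sketch of that aggregation, assuming per-prompt lists of boolean pass/fail flags (the input format here is hypothetical, not the official harness's):

```python
def ifeval_accuracies(results):
    """results: list of per-prompt lists of booleans, one per instruction.

    Returns (prompt_level, instruction_level) accuracy.
    """
    # A prompt passes only if all of its instructions were followed.
    prompts_ok = sum(all(flags) for flags in results)
    # Instruction-level: flatten and average over every instruction.
    instructions = [f for flags in results for f in flags]
    return prompts_ok / len(results), sum(instructions) / len(instructions)

p, i = ifeval_accuracies([[True, True], [True, False], [True]])
# prompt-level 2/3, instruction-level 4/5
```

The strict/loose distinction is orthogonal to this: it only changes how each boolean flag is judged, not how the flags are aggregated.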
|
|
|
|