**Llama3.1-Typhoon2-8B-instruct** is an instruct Thai 🇹🇭 large language model with 8 billion parameters, built on Llama3.1-8B.
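
Since the model keeps the standard Llama 3.1 chat format, it can be run through the usual Hugging Face transformers workflow. The snippet below is a minimal sketch, not an official quickstart: the repository id is an assumption based on the model name, and the bf16/`device_map="auto"` settings are just reasonable defaults.

```python
# Minimal inference sketch using Hugging Face transformers.
# The repository id below is an assumption based on the model name; adjust if needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "scb10x/llama3.1-typhoon2-8b-instruct"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the 8B weights at roughly 16 GB
    device_map="auto",
)

# The tokenizer ships with the Llama 3.1 chat template, so prompts can be
# built with apply_chat_template rather than hand-formatted special tokens.
messages = [{"role": "user", "content": "แนะนำตัวหน่อย"}]  # "Introduce yourself"
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```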
## Performance

**Instruction-Following & Function Call Performance**

<div align="center">
<img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_general.png" alt="Typhoon2 Llama 8B General Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block;"/>
</div>

**Specific Domain Performance (Math & Coding)**

<div align="center">
<img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_specific.png" alt="Typhoon2 Llama 8B Specific Domain Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block;"/>
</div>

**Long Context Performance**

<div align="center">
<img src="https://storage.googleapis.com/typhoon-public/assets/typhoon2-text/llama7b_long.jpg" alt="Typhoon2 Llama 8B Long Context Performance" width="100%" style="margin-left:auto; margin-right:auto; display:block;"/>
</div>

**Detailed Performance**

| Model | IFEval - TH | IFEval - EN | MT-Bench TH | MT-Bench EN | Thai Code-Switching (t=0.7) | Thai Code-Switching (t=1.0) | FunctionCall-TH | FunctionCall-EN | GSM8K-TH | GSM8K-EN | MATH-TH | MATH-EN | HumanEval-TH | HumanEval-EN | MBPP-TH | MBPP-EN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Llama3.1 8B Instruct** | 58.04% | **77.64%** | 5.109 | **8.118** | 93% | 11.2% | 36.92% | 66.06% | 45.18% | 62.4% | 24.42% | 48% | 51.8% | 67.7% | **64.6%** | **66.9%** |
| **Typhoon2 Llama3 8B Instruct** | **72.60%** | 76.43% | **5.7417** | 7.584 | **98.8%** | **98%** | **75.12%** | **79.08%** | **71.72%** | **81.0%** | **38.48%** | **49.04%** | **58.5%** | **68.9%** | 60.8% | 63.0% |

For the release post, please see our [blog](...).

*To acknowledge Meta's effort in creating the foundation model and to comply with the license, we explicitly include "llama-3.1" in the model name.