Commit fdb58fd (verified) by kazukifujii · 1 parent: 8d22ee2

Update README.md

Files changed (1): README.md (+10 −1)
README.md CHANGED

@@ -100,7 +100,16 @@ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) pro
 |---|---|---|---|---|---|---|---|---|---|---|
 | |4-shot|4-shot|4-shot|4-shot|4-shot|5-shot|4-shot|3-shot|0-shot| |
 | |Acc|EM acc|Acc|EM acc|Acc|Acc|EM acc|CoT EM Acc|pass@1| |
-| Llama 3 Youko 70B Instruct | | | | | | | | | | |
+| Llama 3 Youko 70B Instruct | 0.4500| 0.7973| 0.6863| 0.3914| 0.9153| 0.8055| 0.8923| 0.7814| 0.6598| 0.7088|
+| Llama-3.1-70B-Japanese-Instruct-2407| 0.4220| 0.8104| 0.6481| 0.3744| 0.9170| 0.8071| 0.8893| 0.8228| 0.7463| 0.7153|
+| Llama 3 heron brain 70B v0.3| 0.4460| 0.8107| 0.6682| 0.4085| 0.9174| 0.7898| 0.8772| 0.7586| 0.6713| 0.7053|
+| Llama 3 70B Instruct |0.4400| 0.7999| 0.6552| 0.4024| 0.9127| 0.7992| 0.9052| 0.8326| 0.7555| 0.7225|
+| Llama 3.1 70B Instruct |0.4300| 0.8212| 0.6621| 0.3921| 0.9157| 0.8213| 0.8764| 0.8390| 0.7915| 0.7277|
+| Llama 3.3 70B Instruct |0.4260| 0.8172| 0.6674| 0.3933| 0.9174| 0.8240| 0.8901| 0.8529| 0.8341| **0.7358**|
+| Llama 3.1 Swallow 70B Instruct v0.1 |0.4520| 0.8148| 0.6834| 0.4012| 0.9157| 0.7855| 0.8886| 0.8486| 0.5823| 0.7080|
+| **Llama 3.1 Swallow 70B Instruct v0.3** |0.4540| 0.8245| 0.6915| 0.4082| 0.9187| 0.7770| 0.8726| 0.8148| 0.6378| 0.7110|
+| Qwen2-72B-Instruct |0.4360| 0.7588| 0.6857| 0.3913| 0.9110| 0.8391| 0.8499| 0.2436| 0.6939| 0.6455|
+| Qwen2.5-72B-Instruct |0.4540| 0.6764| 0.7064| 0.3550| 0.8895| 0.8478| 0.9113| 0.4027| 0.6165| 0.6511|
 
 
 ## Evaluation Benchmarks