MgGPT/MgGPT-13B

Safetensors

Arabic

llama

jianqing666 committed on Jan 4

Commit 6bd3cb0 · verified · 1 parent: 1a9c4c0

Update README.md

Files changed (1)

README.md +6 -6

README.md CHANGED

@@ -38,7 +38,7 @@ Models output text only.
 | GPT-4         | 74.08  | 65.06          | 72.50                | 85.67 | 57.76 | 84.06        | 79.43      |
 <!-- Benchmark evaluation on [Arabic MMLU](https://github.com/FreedomIntelligence/AceGPT) are conducted using accuracy scores as metrics, following the evaluation framework available at https://github.com/FreedomIntelligence/AceGPT/tree/main. -->
-|                  | STEM | Humanities | Social Sciences | Others | Average |
 |------------------|------|------|------|------|------|
 | Bloomz-7B-base   | 33.35 | 29.29 | 37.58 | 34.53 | 33.69 |
 | LLaMA2-7B-base   | 30.30 | 29.33 | 27.46 | 30.78 | 29.37 |
@@ -49,11 +49,11 @@ Models output text only.
 | Jais-30B-v1-base | 32.67 | 30.67 | 42.13 | 39.60 | 36.27 |
 | ChatGPT 3.5 Turbo | **43.38** | **44.12** | **55.57** | **53.21** | **49.07** |
-<!-- | AceGPT-13B-base  | 36.60 | 38.74 | 43.76 | <u>42.72</u> | 40.45 | -->
-<!-- | AceGPT-7B-base   | 29.73 | 30.95 | 33.45 | 34.42 | 32.14 | -->
-Benchmark evaluation on [ArabicMMLU]((https://github.com/mbzuai-nlp/ArabicMMLU)), and assessed based on its source settings.
 |                  | STEM | Social Sciences | Humanities | Arabic Language | Other | Average |
 |------------------|------|------|------|------|------|------|
 | Bloomz-7B-base   | - | - | - | - | - | - |
@@ -65,8 +65,8 @@ Benchmark evaluation on [ArabicMMLU]((https://github.com/mbzuai-nlp/ArabicMMLU))
 | Jais-30B-v1-base | 39.5 | 45.6 | <u>50.5</u> | 34.6 | 49.1 | 44.8 |
 | ChatGPT 3.5 Turbo | **53.8** | **57.0** | **57.5** | **57.6** | **63.8** | **57.7** |
-<!-- | AceGPT-7B-base   | 35.4 | 35.9 | 36.2 | 31.1 | 41.7 | 36.3 |
-| AceGPT-13B-base  | <u>42.7</u> | 45.5 | 48.3 | 42.4 | 50.7 | 46.1 | -->
 ## Samples
 #### Sample1(abstract_algebra)

 | GPT-4         | 74.08  | 65.06          | 72.50                | 85.67 | 57.76 | 84.06        | 79.43      |
 <!-- Benchmark evaluation on [Arabic MMLU](https://github.com/FreedomIntelligence/AceGPT) are conducted using accuracy scores as metrics, following the evaluation framework available at https://github.com/FreedomIntelligence/AceGPT/tree/main. -->
+<!-- |                  | STEM | Humanities | Social Sciences | Others | Average |
 |------------------|------|------|------|------|------|
 | Bloomz-7B-base   | 33.35 | 29.29 | 37.58 | 34.53 | 33.69 |
 | LLaMA2-7B-base   | 30.30 | 29.33 | 27.46 | 30.78 | 29.37 |
 | Jais-30B-v1-base | 32.67 | 30.67 | 42.13 | 39.60 | 36.27 |
 | ChatGPT 3.5 Turbo | **43.38** | **44.12** | **55.57** | **53.21** | **49.07** |
+| AceGPT-13B-base  | 36.60 | 38.74 | 43.76 | <u>42.72</u> | 40.45 |
+| AceGPT-7B-base   | 29.73 | 30.95 | 33.45 | 34.42 | 32.14 | -->
+<!-- Benchmark evaluation on [ArabicMMLU]((https://github.com/mbzuai-nlp/ArabicMMLU)), and assessed based on its source settings.
 |                  | STEM | Social Sciences | Humanities | Arabic Language | Other | Average |
 |------------------|------|------|------|------|------|------|
 | Bloomz-7B-base   | - | - | - | - | - | - |
 | Jais-30B-v1-base | 39.5 | 45.6 | <u>50.5</u> | 34.6 | 49.1 | 44.8 |
 | ChatGPT 3.5 Turbo | **53.8** | **57.0** | **57.5** | **57.6** | **63.8** | **57.7** |
+| AceGPT-7B-base   | 35.4 | 35.9 | 36.2 | 31.1 | 41.7 | 36.3 |
+| AceGPT-13B-base  | <u>42.7</u> | 45.5 | 48.3 | 42.4 | 50.7 | 46.1 | --> -->
 ## Samples
 #### Sample1(abstract_algebra)