HuggingSara committed on
Commit 638744c · verified · 1 Parent(s): 14fa7ab

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -14,7 +14,7 @@ Welcome to the official HuggingFace repository for BiMediX, the bilingual medica
 
  ## Key Features
 
- - **Bilingual Support**: Seamless interaction in both English and Arabic for a wide range of medical interactions, including multi-turn chats, multiple-choice question answering, and open-ended question answering.
+ - **Bilingual Support**: Seamless interaction in both English and Arabic for a wide range of medical interactions.
  - **BiMed1.3M Dataset**: Unique dataset with 1.3 million bilingual medical interactions across English and Arabic, including 250k synthesized multi-turn doctor-patient chats for instruction tuning.
  - **High-Quality Translation** : Utilizes a semi-automated English-to-Arabic translation pipeline with human refinement to ensure accuracy and quality in translations.
  - **Evaluation Benchmark for Arabic Medical LLMs**: Comprehensive benchmark for evaluating Arabic medical language models, setting a new standard in the field.
@@ -71,7 +71,7 @@ The model's fine-tuning process includes a low-rank adaptation technique (QLoRA)
  </div>
 
 
- ## Dataset
+ ## Data
 
  1. **Compiling English Instruction Set**: The dataset creation began with compiling a dataset in English, covering three types of medical interactions:
 
@@ -81,7 +81,7 @@ The model's fine-tuning process includes a low-rank adaptation technique (QLoRA)
 
  2. **Semi-Automated Iterative Translation**: To create high-quality Arabic versions, a semi-automated translation pipeline with human alignment was used.
  4. **Bilingual Benchmark & Instruction Set Creation**: The English medical evaluation benchmarks were translated into Arabic.
- This created a high-quality Arabic medical benchmark, which, combined with the original English benchmarks, formed a bilingual benchmark.
+ This created a high-quality Arabic medical benchmark, and combined with the original English benchmarks, formed a bilingual benchmark.
  The BiMed1.3M dataset, resulting from translating 444,995 English samples into Arabic and mixing Arabic and English in a 1:2 ratio, was then used for instruction tuning.
 
  ## Benchmarks and Performance
@@ -95,7 +95,7 @@ The BiMediX model was evaluated across several benchmarks, demonstrating its eff
  - *Medical MMLU*: A compilation of questions from various medical subjects, requiring broad medical knowledge.
 
  2. **Results and Comparisons:**
- - **Bilingual Evaluation**: BiMediX showed superior performance in bilingual (Arabic-English) evaluations, outperforming both the Mixtral-8x7B base model and Jais-30B, a model designed for Arabic. It demonstrated more than 10 and 15 points higher average accuracy, respectively.
+ - **Bilingual Evaluation**: BiMediX showed superior performance in bilingual (Arabic-English) evaluations, outperforming both the Mixtral-8x7B base model and Jais-30B. It demonstrated more than 10 and 15 points higher average accuracy, respectively.
  - **Arabic Benchmark**: In Arabic-specific evaluations, BiMediX outperformed Jais-30B in all categories, highlighting the effectiveness of the BiMed1.3M dataset and bilingual training.
  - **English Benchmark**: BiMediX also excelled in English medical benchmarks, surpassing other state-of-the-art models like Med42-70B and Meditron-70B in terms of average performance and efficiency.
 