HuggingSara committed on Feb 20, 2024

Commit

638744c

1 Parent(s): 14fa7ab

Update README.md

README.md +4 -4

@@ -14,7 +14,7 @@ Welcome to the official HuggingFace repository for BiMediX, the bilingual medica
 ## Key Features
-- **Bilingual Support**: Seamless interaction in both English and Arabic for a wide range of medical interactions, including multi-turn chats, multiple-choice question answering, and open-ended question answering.
+- **Bilingual Support**: Seamless interaction in both English and Arabic for a wide range of medical interactions.
 - **BiMed1.3M Dataset**: Unique dataset with 1.3 million bilingual medical interactions across English and Arabic, including 250k synthesized multi-turn doctor-patient chats for instruction tuning.
 - **High-Quality Translation** : Utilizes a semi-automated English-to-Arabic translation pipeline with human refinement to ensure accuracy and quality in translations.
 - **Evaluation Benchmark for Arabic Medical LLMs**: Comprehensive benchmark for evaluating Arabic medical language models, setting a new standard in the field.
@@ -71,7 +71,7 @@ The model's fine-tuning process includes a low-rank adaptation technique (QLoRA)
 </div>
-## Dataset
+## Data
 1. **Compiling English Instruction Set**: The dataset creation began with compiling a dataset in English, covering three types of medical interactions:
@@ -81,7 +81,7 @@ The model's fine-tuning process includes a low-rank adaptation technique (QLoRA)
 2. **Semi-Automated Iterative Translation**: To create high-quality Arabic versions, a semi-automated translation pipeline with human alignment was used.
 4. **Bilingual Benchmark & Instruction Set Creation**: The English medical evaluation benchmarks were translated into Arabic.
-This created a high-quality Arabic medical benchmark, which, combined with the original English benchmarks, formed a bilingual benchmark.
+This created a high-quality Arabic medical benchmark, and combined with the original English benchmarks, formed a bilingual benchmark.
 The BiMed1.3M dataset, resulting from translating 444,995 English samples into Arabic and mixing Arabic and English in a 1:2 ratio, was then used for instruction tuning.
 ## Benchmarks and Performance
@@ -95,7 +95,7 @@ The BiMediX model was evaluated across several benchmarks, demonstrating its eff
    - *Medical MMLU*: A compilation of questions from various medical subjects, requiring broad medical knowledge.
 2. **Results and Comparisons:**
-   - **Bilingual Evaluation**: BiMediX showed superior performance in bilingual (Arabic-English) evaluations, outperforming both the Mixtral-8x7B base model and Jais-30B, a model designed for Arabic. It demonstrated more than 10 and 15 points higher average accuracy, respectively.
+   - **Bilingual Evaluation**: BiMediX showed superior performance in bilingual (Arabic-English) evaluations, outperforming both the Mixtral-8x7B base model and Jais-30B. It demonstrated more than 10 and 15 points higher average accuracy, respectively.
    - **Arabic Benchmark**: In Arabic-specific evaluations, BiMediX outperformed Jais-30B in all categories, highlighting the effectiveness of the BiMed1.3M dataset and bilingual training.
    - **English Benchmark**: BiMediX also excelled in English medical benchmarks, surpassing other state-of-the-art models like Med42-70B and Meditron-70B in terms of average performance and efficiency.