---
license: mit
datasets:
- Slim205/total_data_baraka_ift
language:
- ar
base_model:
- google/gemma-2-2b-it
---
|
|
|
The goal of this project is to adapt large language models to Arabic. Because Arabic instruction fine-tuning data is scarce, the focus is on building a high-quality instruction fine-tuning (IFT) dataset, fine-tuning models on it, and evaluating their performance across various benchmarks.
|
|
|
This model is the 2B version. It was trained for 2 days on a single A100 GPU using LoRA with a rank of 128, a learning rate of 1e-4, and a cosine learning rate schedule.
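
The sketch below shows how such a configuration could be expressed with `transformers` and `peft`. The rank, learning rate, and scheduler match the values above; the LoRA alpha, target modules, batch size, epoch count, and the training loop itself are not stated in this card and are illustrative assumptions.

```python
# Sketch of the LoRA fine-tuning configuration described above (rank 128,
# learning rate 1e-4, cosine schedule). Alpha, target modules, batch size,
# and epoch count are illustrative assumptions; the training loop is omitted.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

base_model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

lora_config = LoraConfig(
    r=128,                                    # rank stated in this card
    lora_alpha=256,                           # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="barka-2b-it",
    learning_rate=1e-4,                       # learning rate stated in this card
    lr_scheduler_type="cosine",               # cosine schedule stated in this card
    per_device_train_batch_size=4,            # assumption
    num_train_epochs=1,                       # assumption
)
```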
|
|
|
| Metric        | Slim205/Barka-2b-it |
|---------------|---------------------|
| Average       | 46.98 |
| ACVA          | 39.5  |
| AlGhafa       | 46.5  |
| MMLU          | 37.06 |
| EXAMS         | 38.73 |
| ARC Challenge | 35.78 |
| ARC Easy      | 36.97 |
| BOOLQ         | 73.77 |
| COPA          | 50    |
| HELLASWAG     | 28.98 |
| OPENBOOK QA   | 43.84 |
| PIQA          | 56.36 |
| RACE          | 36.19 |
| SCIQ          | 55.78 |
| TOXIGEN       | 78.29 |
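
A minimal usage sketch follows, assuming the repository `Slim205/Barka-2b-it` (the column name in the table above) contains a merged, ready-to-load checkpoint rather than only a LoRA adapter; the prompt and generation settings are illustrative.

```python
# Minimal sketch of loading the model and generating a reply with transformers.
# Assumes Slim205/Barka-2b-it is a merged checkpoint, not a LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Slim205/Barka-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# "What is the capital of Tunisia?" -- illustrative Arabic prompt
messages = [{"role": "user", "content": "ما هي عاصمة تونس؟"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```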
|
|