## Test Set Details
The test set used for evaluation consists of 1,000 sentences geolocated to the 14 most-populated Arab countries (excluding Somalia, for which data was scarce). Each sample is annotated by native speakers recruited from 11 Arab countries: Algeria, Egypt, Iraq, Jordan, Morocco, Palestine, Saudi Arabia, Sudan, Syria, Tunisia, and Yemen.
## Evaluation Metrics
We compute the precision, recall, and F1 scores for each of the 11 countries (treating each label as a binary classification problem).
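As a rough illustration of the per-country scoring described above, the following Python sketch treats each country as its own binary label and computes precision, recall, and F1 from true positives, false positives, and false negatives. It is not the leaderboard's actual evaluation code: the function name `per_country_scores`, the representation of each sample as a set of country names, and the choice to return 0.0 when a denominator is zero are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# The 11 countries listed in "Test Set Details".
COUNTRIES = [
    "Algeria", "Egypt", "Iraq", "Jordan", "Morocco", "Palestine",
    "Saudi Arabia", "Sudan", "Syria", "Tunisia", "Yemen",
]


def per_country_scores(
    gold: List[Set[str]],
    pred: List[Set[str]],
    labels: List[str] = COUNTRIES,
) -> Dict[str, Tuple[float, float, float]]:
    """Return {country: (precision, recall, f1)}, one binary problem per country.

    gold[i] and pred[i] are the sets of countries assigned to sentence i
    (hypothetical input format, assumed for this sketch).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of samples")

    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)

        # Assumption: undefined ratios (zero denominator) are reported as 0.0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )
        scores[label] = (precision, recall, f1)
    return scores


# Toy example with three sentences (made-up labels, for illustration only).
example_gold = [{"Egypt", "Sudan"}, {"Egypt"}, {"Sudan"}]
example_pred = [{"Egypt", "Sudan"}, {"Sudan"}, {"Sudan"}]
example_scores = per_country_scores(example_gold, example_pred)
for country in ("Egypt", "Sudan"):
    print(country, example_scores[country])
```

Because a sentence can be valid in several dialects, each sample carries a set of labels rather than a single one, which is why the scores are computed independently per country instead of from a single confusion matrix.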
## Data Access
To access the single-label training sets and the multi-label development set, please fill out the following form: https://forms.gle/t3QTC6ZqyDJBzAau8
#### Further Notes
* The beta version of the leaderboard runs on limited resources and cannot evaluate models with a relatively large number of parameters.
* Please refer to the [paper](https://aclanthology.org/2024.arabicnlp-1.79/) for more information about how the data was curated and annotated.
* We are planning to extend the annotations to include more country-level dialects. If you are interested in helping, please ping us, and we are happy to discuss it further.