---
license: gpl-2.0
language: ar
---
A model jointly trained and fine-tuned on the Quran, Saheefa, and Nahj al-Balagha. All datasets are available [here](https://github.com/language-ml/course-nlp-ir-1-text-exploring/tree/main/exploring-datasets/religious_text). Code will be available soon.

Some examples of filling the mask:
- ```
ذَٰلِكَ [MASK] لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ
```
- ```
يَا أَيُّهَا النَّاسُ اعْبُدُوا رَبَّكُمُ الَّذِي خَلَقَكُمْ وَالَّذِينَ مِنْ قَبْلِكُمْ لَعَلَّكُمْ [MASK]
```
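Queries like the examples above are plain strings containing the model's `[MASK]` token. A minimal sketch of building such a query from the first example (Quran 2:2, with the masked word restored); the `fill-mask` pipeline call is shown commented out because it downloads model weights, and the model id is a placeholder, not a name from this card:

```python
# Build a fill-mask query like the examples above: take a sentence and
# replace the target word with the model's [MASK] token.
verse = "ذَٰلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ"  # Quran 2:2
query = verse.replace("الْكِتَابُ", "[MASK]", 1)
print(query)

# With the transformers library installed, the query could then be scored
# (the model id below is a placeholder -- substitute this model's Hub id):
# from transformers import pipeline
# unmasker = pipeline("fill-mask", model="<this-model-id>")
# unmasker(query)  # returns candidate fillings ranked by score
```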
This model is fine-tuned from [Bert Base Arabic](https://huggingface.co/asafaya/bert-base-arabic) for 30 epochs with a masked language modeling objective. In addition, every 5 epochs we re-drew the random masks over the training data, so the model sees fresh masked positions, learns the embeddings well, and does not overfit to one fixed masking.
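The periodic re-masking can be sketched in plain Python (a toy illustration of the idea, not the authors' released code; the token ids and `[MASK]` id below are made up):

```python
import random

def mask_tokens(token_ids, mask_id, prob=0.15, seed=None):
    """Return a masked copy of token_ids plus the masked positions.

    Each position is independently replaced by mask_id with probability
    `prob`; the masked positions serve as the MLM prediction targets.
    """
    rng = random.Random(seed)
    ids = list(token_ids)
    positions = [i for i in range(len(ids)) if rng.random() < prob]
    for i in positions:
        ids[i] = mask_id
    return ids, positions

tokens = list(range(100, 200))  # toy token-id sequence
MASK = 4                        # hypothetical [MASK] id

for epoch in range(30):
    if epoch % 5 == 0:
        # Re-draw the masks every 5 epochs so later epochs train on
        # different masked positions instead of one fixed masking.
        masked, targets = mask_tokens(tokens, MASK, seed=epoch)
    # ... train one epoch on (masked, targets) ...
```

Note that the `DataCollatorForLanguageModeling` collator in the transformers library applies this kind of dynamic masking per batch rather than on a fixed schedule.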