elmadany committed on commit 823a383 (1 parent: 1b5e0d8)
commit from elmadany
README.md
CHANGED
@@ -1,12 +1,12 @@

<img src="https://raw.githubusercontent.com/UBC-NLP/marbert/main/ARBERT_MARBERT.jpg" alt="drawing" width="30%" height="30%" align="right"/>

- **ARBERT** is one of three models described in our ACl 2021 paper ["ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic"](https://mageed.arts.ubc.ca/files/2020/12/marbert_arxiv_2020.pdf)


# BibTex

- If you use our models (ARBERT, MARBERT, or
```bibtex
@inproceedings{abdul-mageed-etal-2021-arbert,
    title = "{ARBERT} {\&} {MARBERT}: Deep Bidirectional Transformers for {A}rabic",

<img src="https://raw.githubusercontent.com/UBC-NLP/marbert/main/ARBERT_MARBERT.jpg" alt="drawing" width="30%" height="30%" align="right"/>

+ **ARBERT** is one of three models described in our **ACL 2021 paper** **["ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic"](https://mageed.arts.ubc.ca/files/2020/12/marbert_arxiv_2020.pdf)**. ARBERT is a large-scale pre-trained masked language model focused on Modern Standard Arabic (MSA). ARBERT uses the same architecture as BERT-base: 12 attention layers, each with 12 attention heads and a hidden dimension of 768, and a vocabulary of 100K WordPieces, for a total of ∼163M parameters. We train ARBERT on a collection of Arabic datasets comprising **61GB of text** (**6.2B tokens**). For more information, please visit our GitHub [repo](https://github.com/UBC-NLP/marbert).

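A minimal loading and mask-filling sketch with the `transformers` library; the `UBC-NLP/ARBERT` Hub identifier and the example sentence are assumptions for illustration, not taken from the paper:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hub identifier is an assumption; adjust if ARBERT is hosted under a different name.
model_id = "UBC-NLP/ARBERT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)
model.eval()

# Mask one token in an illustrative MSA sentence and let the model fill it in.
text = f"اللغة العربية لغة {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary entry.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```

Fine-tuning on downstream tasks would follow the standard BERT recipe, e.g. loading the same checkpoint with `AutoModelForSequenceClassification`.
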
# BibTex

+ If you use our models (ARBERT, MARBERT, or MARBERTv2) for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated):
```bibtex
@inproceedings{abdul-mageed-etal-2021-arbert,
    title = "{ARBERT} {\&} {MARBERT}: Deep Bidirectional Transformers for {A}rabic",