Update README.md
README.md
CHANGED
@@ -15,7 +15,18 @@ The pre-trained CafeBERT model is the state-of-the-art language model for Vietna
 
 CafeBERT is a large-scale multilingual language model with strong support for Vietnamese. The model is based on XLM-Roberta (the state-of-the-art multilingual language model) and is enhanced with a large Vietnamese corpus with many domains: Wikipedia, newspapers... CafeBERT has outstanding performance on the VLUE benchmark and other tasks, such as machine reading comprehension, text classification, natural language inference, part-of-speech tagging...
 
-The general architecture and experimental results of PhoBERT can be found in our paper:
+The general architecture and experimental results of PhoBERT can be found in our [paper](https://arxiv.org/abs/2403.15882):
+
+```
+@misc{do2024vlue,
+title={VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding},
+author={Phong Nguyen-Thuan Do and Son Quoc Tran and Phu Gia Hoang and Kiet Van Nguyen and Ngan Luu-Thuy Nguyen},
+year={2024},
+eprint={2403.15882},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
+```
 
 Please **CITE** our paper when CafeBERT is used to help produce published results or is incorporated into other software.
 