Update README.md
README.md
CHANGED
@@ -49,7 +49,9 @@ language:
 - Paper: https://aclanthology.org/2022.findings-emnlp.499.pdf
 - Repository: https://github.com/huawei-noah/noah-research/tree/master/NLP/EntityCS
 - Point of Contact: [Fenia Christopoulou](mailto:[email protected]), [Chenxi Whitehouse](mailto:[email protected])
-
+
+## Model Description
+
 This model has been trained on the EntityCS corpus, an English corpus from Wikipedia with replaced entities in different languages.
 The corpus can be found in [https://huggingface.co/huawei-noah/entity_cs](https://huggingface.co/huawei-noah/entity_cs), check the link for more details.
 To train models on the corpus, we first employ the conventional 80-10-10 MLM objective, where 15% of sentence subwords are considered as masking candidates. From those, we replace subwords
@@ -105,7 +107,7 @@ In the paper, we focused on entity-related tasks, such as NER, Word Sense Disamb
 
 Alternatively, it can be used directly (no fine-tuning) for probing tasks, i.e. predict missing words, such as [X-FACTR](https://aclanthology.org/2020.emnlp-main.479/).
 
-For results on each downstream task, please refer to the paper.
+For results on each downstream task, please refer to the [paper](https://aclanthology.org/2022.findings-emnlp.499.pdf).
 
 
 ## How to Get Started with the Model
@@ -114,7 +116,7 @@ Use the code below to get started with the model: https://github.com/huawei-noah
 
 ## Citation
 
-**BibTeX
+**BibTeX**
 
 ```html
 @inproceedings{whitehouse-etal-2022-entitycs,
@@ -132,8 +134,8 @@ Use the code below to get started with the model: https://github.com/huawei-noah
 }
 ```
 
-**APA
+**APA**
 
 ```html
-Whitehouse, C., Christopoulou, F., & Iacobacci, I. (2022). EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching. In Findings of the Association for Computational Linguistics: EMNLP 2022
+Whitehouse, C., Christopoulou, F., & Iacobacci, I. (2022). EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching. In Findings of the Association for Computational Linguistics: EMNLP 2022.
 ```
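
The model description touched by this diff mentions the conventional 80-10-10 MLM objective, with 15% of sentence subwords taken as masking candidates. The README line is cut off in the diff, so the sketch below assumes the standard BERT-style split: 80% of candidates become the mask token, 10% become a random subword, and 10% stay unchanged. The function name `mask_80_10_10`, the `MASK_TOKEN` value, and the toy vocabulary are hypothetical and are not taken from the EntityCS repository.

```python
import random

MASK_TOKEN = "<mask>"  # assumed mask symbol; the real one depends on the tokenizer


def mask_80_10_10(tokens, vocab, candidate_rate=0.15, rng=None):
    """Corrupt one sequence of subwords for masked language modelling.

    Returns (corrupted, labels). labels[i] is the original subword at
    candidate positions and None elsewhere, so a loss would only be
    computed on the candidates.
    """
    if rng is None:
        rng = random.Random()
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= candidate_rate:
            continue  # not selected as a masking candidate
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random subword
        # remaining 10%: leave the subword unchanged
    return corrupted, labels


# Toy usage with a made-up vocabulary and a seeded generator.
toy_vocab = ["the", "cat", "sat", "on", "mat"]
toy_tokens = ["the", "cat", "sat", "on", "the", "mat"] * 5
toy_corrupted, toy_labels = mask_80_10_10(toy_tokens, toy_vocab, rng=random.Random(0))
print(toy_corrupted)
print(toy_labels)
```

Candidates are sampled independently per position here, which keeps the sketch short; an implementation could instead pick exactly 15% of positions per sentence.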