---
language:
- multilingual
- ar
- bn
- de
- el
- en
- es
- fi
- fr
- hi
- id
- it
- ja
- ko
- nl
- pl
- pt
- ru
- sv
- sw
- te
- th
- tr
- vi
- zh
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
- luke
- named entity recognition
- relation classification
- question answering
license: apache-2.0
---

## mLUKE

**mLUKE** (multilingual LUKE) is a multilingual extension of LUKE.

Please check the [official repository](https://github.com/studio-ousia/luke) for
more details and updates.

This is the mLUKE base model with 12 hidden layers and a hidden size of 768. The total number of parameters in this model is 279M.
The model was initialized with the weights of XLM-RoBERTa (base) and trained on the December 2020 version of Wikipedia in 24 languages.

This model is a lightweight version of [studio-ousia/mluke-base](https://huggingface.co/studio-ousia/mluke-base): it has no Wikipedia entity embeddings and keeps only special entities such as `[MASK]`.
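
Below is a minimal usage sketch, assuming the `transformers` library with `MLukeTokenizer` and `LukeModel`; the example sentence and character-level entity spans are illustrative, not taken from the original card:

```python
from transformers import MLukeTokenizer, LukeModel

# Illustrative usage of this checkpoint; MLukeTokenizer needs the sentencepiece package.
tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base-lite")
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite")

text = "ISO 639-3 uses the code fas for the dialects spoken across Iran and Afghanistan."
# Character-level spans of the mentions "ISO 639-3" and "Iran" (illustrative values).
entity_spans = [(0, 9), (59, 63)]

# Since this checkpoint only has special entity embeddings, the spans are encoded
# with the [MASK] entity (the tokenizer's default when no entities are given) and
# contextualized by the surrounding words.
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

word_states = outputs.last_hidden_state            # shape: (1, seq_len, 768)
entity_states = outputs.entity_last_hidden_state   # shape: (1, 2, 768)
```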

## Note

When you load the model via `AutoModel.from_pretrained` with the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
...]
```

These weights are the weights for entity-aware attention (as described in [the LUKE paper](https://arxiv.org/abs/2010.01057)).
This is expected: `use_entity_aware_attention` is set to `false` by default, but the checkpoint still contains these weights so that they can be loaded if you enable `use_entity_aware_attention`.
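
A minimal sketch of both options, assuming the standard `transformers` config-override mechanism with `LukeConfig` and `LukeModel` (the option name comes from the warning above):

```python
from transformers import LukeConfig, LukeModel

# Default configuration: entity-aware attention is disabled, so the *_query
# weights listed in the warning are simply skipped.
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite")

# To actually use those weights, enable the option in the config before loading.
config = LukeConfig.from_pretrained(
    "studio-ousia/mluke-base-lite", use_entity_aware_attention=True
)
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite", config=config)
```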

### Citation

If you find mLUKE useful for your work, please cite the following paper:

```bibtex
@inproceedings{ri-etal-2022-mluke,
    title = "m{LUKE}: {T}he Power of Entity Representations in Multilingual Pretrained Language Models",
    author = "Ri, Ryokan and
      Yamada, Ikuya and
      Tsuruoka, Yoshimasa",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2022",
    url = "https://aclanthology.org/2022.acl-long.505",
}
```