ikuyamada committed on
Commit
f650a63
·
1 Parent(s): c349af2

update README

  1. README.md +56 -0
README.md CHANGED
@@ -1,3 +1,59 @@
  ---
+ language: ja
+ thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
+ tags:
+ - luke
+ - named entity recognition
+ - entity typing
+ - relation classification
+ - question answering
  license: apache-2.0
  ---
+
+ ## luke-japanese-large
+
+ **luke-japanese** is the Japanese version of **LUKE** (**L**anguage
+ **U**nderstanding with **K**nowledge-based **E**mbeddings), a pre-trained
+ _knowledge-enhanced_ contextualized representation of words and entities. LUKE
+ treats words and entities in a given text as independent tokens, and outputs
+ contextualized representations of them. Please refer to our
+ [GitHub repository](https://github.com/studio-ousia/luke) for more details and
+ updates.
+
+ This model contains Wikipedia entity embeddings, which are not used in general
+ NLP tasks. Please use the
+ [lite version](https://huggingface.co/studio-ousia/luke-japanese-large-lite/)
+ for tasks that do not use Wikipedia entities as inputs.
+
+ **luke-japanese** is the Japanese version of **LUKE**, a knowledge-enhanced pre-trained Transformer model of words and entities. LUKE treats words and entities as independent tokens and outputs contextualized representations of them. For details, see the [GitHub repository](https://github.com/studio-ousia/luke).
+
+ This model contains Wikipedia entity embeddings, which are not used in ordinary NLP tasks. For tasks that use only word inputs, please use the [lite version](https://huggingface.co/studio-ousia/luke-japanese-large-lite/).
+
+ ### Experimental results on JGLUE
+
+ The experimental results evaluated on the dev set of
+ [JGLUE](https://github.com/yahoojapan/JGLUE) are shown below:
+
+ | Model                         | MARC-ja   | JSTS                | JNLI      | JCommonsenseQA |
+ | ----------------------------- | --------- | ------------------- | --------- | -------------- |
+ |                               | acc       | Pearson/Spearman    | acc       | acc            |
+ | **LUKE Japanese large**       | **0.965** | **0.932**/**0.902** | **0.927** | 0.893          |
+ | _Baselines:_                  |           |                     |           |                |
+ | Tohoku BERT large             | 0.955     | 0.913/0.872         | 0.900     | 0.816          |
+ | Waseda RoBERTa large (seq128) | 0.954     | 0.930/0.896         | 0.924     | **0.907**      |
+ | Waseda RoBERTa large (seq512) | 0.961     | 0.926/0.892         | 0.926     | 0.891          |
+ | XLM RoBERTa large             | 0.964     | 0.918/0.884         | 0.919     | 0.840          |
+
+ The baseline scores are taken from the
+ [JGLUE README](https://github.com/yahoojapan/JGLUE/blob/a6832af23895d6faec8ecf39ec925f1a91601d62/README.md).
+
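The JSTS column in the results table reports two correlation coefficients, Pearson and Spearman, between predicted and gold similarity scores. As a reading aid, here is a minimal pure-Python sketch of how such a pair can be computed. The function names `pearson`, `average_ranks`, and `spearman` are illustrative and are not taken from the JGLUE or LUKE code, and the official evaluation scripts may differ in detail.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of std deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical gold and predicted similarity scores, for illustration only.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
pred = [0.2, 1.0, 3.5, 3.9, 4.8]
print(f"Pearson={pearson(gold, pred):.3f} Spearman={spearman(gold, pred):.3f}")
```

Pearson rewards predictions that are linearly related to the gold scores, while Spearman only checks that the ordering agrees, which is why the two numbers in the JSTS column differ.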
+ ### Citation
+
+ ```bibtex
+ @inproceedings{yamada2020luke,
+   title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
+   author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
+   booktitle={EMNLP},
+   year={2020}
+ }
+ ```
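
The README states that LUKE treats words and entities as independent tokens and returns contextualized representations of both. The sketch below shows one way this might look with the Hugging Face `transformers` library. It is not part of the commit and rests on assumptions: that the checkpoint loads through `AutoTokenizer` and `AutoModel`, that `torch`, `transformers`, and `sentencepiece` are installed, and that entity mentions are passed as character spans via the LUKE tokenizers' `entity_spans` argument. The helper names `find_entity_spans` and `encode_with_entities` are hypothetical.

```python
from typing import List, Tuple

# Checkpoint described in this README; loading it downloads the weights.
MODEL_NAME = "studio-ousia/luke-japanese-large"


def find_entity_spans(text: str, mentions: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of the first occurrence of each mention."""
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start < 0:
            raise ValueError(f"mention not found in text: {mention!r}")
        spans.append((start, start + len(mention)))
    return spans


def encode_with_entities(text: str, mentions: List[str], model_name: str = MODEL_NAME):
    """Encode text plus entity spans and return word and entity representations."""
    # Imported lazily so find_entity_spans works without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    spans = find_entity_spans(text, mentions)
    # With entity_spans but no entity names, the LUKE tokenizers fill the
    # entity sequence with [MASK] entities, one per span.
    inputs = tokenizer(text, entity_spans=spans, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Word-level and entity-level contextualized representations.
    return outputs.last_hidden_state, outputs.entity_last_hidden_state


# Span computation only; this part needs no model download.
example_text = "夏目漱石は東京で生まれた。"
print(find_entity_spans(example_text, ["夏目漱石", "東京"]))  # [(0, 4), (5, 7)]
```

Calling `encode_with_entities(example_text, ["夏目漱石", "東京"])` would download the full checkpoint. For inputs with no entity spans, the README recommends the lite version instead.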