KoichiYasuoka committed
Commit · 750f3fc
1 Parent(s): 1e51901
dependency-parsing

README.md CHANGED
@@ -7,6 +7,7 @@ tags:
 - "ancient chinese"
 - "token-classification"
 - "pos"
+- "dependency-parsing"
 datasets:
 - "universal_dependencies"
 license: "apache-2.0"
@@ -19,7 +20,7 @@ widget:
 
 ## Model Description
 
-This is a RoBERTa model pre-trained on Classical Chinese texts for POS-tagging, derived from [roberta-classical-chinese-base-char](https://huggingface.co/KoichiYasuoka/roberta-classical-chinese-base-char). Every word is tagged by [UPOS](https://universaldependencies.org/u/pos/) (Universal Part-Of-Speech).
+This is a RoBERTa model pre-trained on Classical Chinese texts for POS-tagging and dependency-parsing, derived from [roberta-classical-chinese-base-char](https://huggingface.co/KoichiYasuoka/roberta-classical-chinese-base-char). Every word is tagged by [UPOS](https://universaldependencies.org/u/pos/) (Universal Part-Of-Speech).
 
 ## How to Use
 
@@ -28,11 +29,16 @@ import torch
 from transformers import AutoTokenizer,AutoModelForTokenClassification
 tokenizer=AutoTokenizer.from_pretrained("KoichiYasuoka/roberta-classical-chinese-base-upos")
 model=AutoModelForTokenClassification.from_pretrained("KoichiYasuoka/roberta-classical-chinese-base-upos")
-s="子曰學而時習之不亦説乎有朋自遠方來不亦樂乎人不知而不慍不亦君子乎"
-p=[model.config.id2label[q] for q in torch.argmax(model(tokenizer.encode(s,return_tensors="pt"))["logits"],dim=2)[0].tolist()[1:-1]]
-print(list(zip(s,p)))
 ```
 
+or
+
+```py
+import esupar
+nlp=esupar.load("KoichiYasuoka/bert-base-japanese-upos")
+```
+
+
 ## Reference
 
 Koichi Yasuoka: [Universal Dependencies Treebank of the Four Books in Classical Chinese](http://hdl.handle.net/2433/245217), DADH2019: 10th International Conference of Digital Archives and Digital Humanities (December 2019), pp.20-28.
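
The three lines this commit removes from "How to Use" (`s=`, `p=`, `print`) packed the whole tagging step into one expression: take the argmax over the label dimension, drop the first and last token, map ids through `id2label`, and zip the labels with the input characters. The following is a minimal sketch of that decoding step only, written in plain Python so it runs without downloading the model. The helper name `decode_upos`, the toy logits, and the three-label set are hypothetical and chosen for illustration; the sketch also assumes one token per input character, as the removed `zip(s,p)` did.

```python
from typing import Dict, List, Sequence, Tuple


def decode_upos(text: str,
                logits: Sequence[Sequence[float]],
                id2label: Dict[int, str]) -> List[Tuple[str, str]]:
    """Pair each character of `text` with its highest-scoring label.

    `logits` holds one row of label scores per token, including the
    leading and trailing special tokens, which are dropped here
    (the `[1:-1]` slice in the removed README line).
    """
    # argmax over the label dimension for every token position
    best = [max(range(len(row)), key=row.__getitem__) for row in logits]
    labels = [id2label[i] for i in best[1:-1]]
    # Assumption: the tokenizer yields exactly one token per character.
    if len(labels) != len(text):
        raise ValueError("expected one token per character")
    return list(zip(text, labels))


# Toy stand-in for model output: 2 special tokens + 2 characters, 3 labels.
# The label set and scores below are made up for illustration.
toy_id2label = {0: "NOUN", 1: "VERB", 2: "PART"}
toy_logits = [
    [0.0, 0.0, 0.0],   # leading special token (ignored)
    [2.1, 0.3, -1.0],  # 子
    [0.2, 1.7, 0.1],   # 曰
    [0.0, 0.0, 0.0],   # trailing special token (ignored)
]
print(decode_upos("子曰", toy_logits, toy_id2label))
```

With the real model, the inputs would come from the calls in the removed lines: `model(tokenizer.encode(s,return_tensors="pt"))["logits"][0].tolist()` for `logits` and `model.config.id2label` for `id2label`.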