rebel-base-chinese-cndbpedia is a generation-based relation extraction model.

· A SOTA Chinese end-to-end relation extraction model that uses BART as its backbone.

· Trained with the method from "REBEL: Relation Extraction By End-to-end Language generation" (Findings of EMNLP 2021).

· Trained on distantly supervised data from CN-DBpedia, starting from the fnlp/bart-base-chinese checkpoint.

· Can achieve SOTA results on many Chinese relation extraction datasets, such as lic2019, lic2020, and HacRED.

· Easy to use, just like a normal generation task.

· The input is a sentence and the output is a sequence of linearized triples. For example, input: 姚明是一名NBA篮球运动员 ("Yao Ming is an NBA basketball player"); output: [subj]姚明[obj]NBA[rel]公司[obj]篮球运动员[rel]职业, which encodes the triples (姚明, 公司 "company", NBA) and (姚明, 职业 "occupation", 篮球运动员 "basketball player"). See the REBEL paper for more details.

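The linearized output format described above can be unpacked into (subject, relation, object) tuples with a small helper. The following is a minimal sketch, not part of the model's own code: the function name `parse_triples` is hypothetical, and the default markers `[subj]`, `[obj]`, `[rel]` are copied from the example above. If the decoded output uses different markers (for instance the `<subj>`, `<obj>`, `<rel>` tokens registered with the tokenizer), pass them as arguments.

```python
def parse_triples(text, subj="[subj]", obj="[obj]", rel="[rel]"):
    """Split a linearized string into (subject, relation, object) tuples.

    Assumed layout, taken from the example in this card:
    [subj] S [obj] O1 [rel] R1 [obj] O2 [rel] R2 ...
    """
    triples = []
    # Everything before the first subject marker is ignored.
    for chunk in text.split(subj)[1:]:
        # The first piece is the subject; each later piece is "object[rel]relation".
        subject, *pairs = chunk.split(obj)
        subject = subject.strip()
        for pair in pairs:
            if rel not in pair:
                continue  # incomplete pair, e.g. generation was cut off
            tail, relation = pair.split(rel, 1)
            triples.append((subject, relation.strip(), tail.strip()))
    return triples


example = "[subj]姚明[obj]NBA[rel]公司[obj]篮球运动员[rel]职业"
print(parse_triples(example))
```

Keeping the markers as parameters means the same helper works whichever marker spelling appears in the generated text.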
Using the model:

from transformers import BertTokenizer, BartForConditionalGeneration

# Load the tokenizer from the base checkpoint and register the marker tokens
model_name = 'fnlp/bart-base-chinese'

tokenizer_kwargs = {
    "use_fast": True,
    "additional_special_tokens": ['<rel>', '<obj>', '<subj>'],
}

tokenizer = BertTokenizer.from_pretrained(model_name, **tokenizer_kwargs)

# Load the relation extraction model
model = BartForConditionalGeneration.from_pretrained("fanxiao/rebel-base-chinese-cndbpedia")
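
Once the tokenizer and model are loaded as shown above, extraction is an ordinary `generate` call. The helper below is a minimal sketch under stated assumptions, not the author's reference code: the name `generate_linearized` and the settings `max_length=256` and `num_beams=3` are illustrative choices that do not come from this model card.

```python
def generate_linearized(sentence, tokenizer, model, max_length=256, num_beams=3):
    """Return the raw linearized triple string generated for one sentence.

    `tokenizer` and `model` are the objects loaded in the snippet above.
    `max_length` and `num_beams` are assumed values, not documented settings.
    """
    inputs = tokenizer(
        sentence,
        return_tensors="pt",
        truncation=True,
        max_length=max_length,
    )
    # BertTokenizer also returns token_type_ids, which BART does not use,
    # so only input_ids and attention_mask are forwarded to generate().
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
    )
    # Keep special tokens: the marker tokens are registered as special tokens
    # and would be dropped from the text if skip_special_tokens were True.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=False)[0]
```

A call such as `generate_linearized("姚明是一名NBA篮球运动员", tokenizer, model)` returns the decoded string. Because special tokens are kept, the result may also contain tokens such as `[CLS]`, `[SEP]`, or `[PAD]`, and the tokenizer may insert spaces between characters, so some cleanup is likely needed before parsing the triples.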