File size: 1,312 Bytes
e5f8590 43a0492 764e561 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
---
language:
- zh
license: apache-2.0
datasets:
- TencentKuaibao
metrics:
- bleu
- rouge
---
## 模型
- 基于中文[MengziT5](https://huggingface.co/Langboat/mengzi-t5-base)的新闻评论生成模型
- 数据集来源于论文[《Coherent Comment Generation for Chinese Articles with a Graph-to-Sequence Model》](https://github.com/lancopku/Graph-to-seq-comment-generation)
## 生成评论
- 在线API只能生成一种评论,模型通过设置model.generate()参数是可以生成多种评论的
```Python
t5_tokenizer = T5Tokenizer.from_pretrained("Langboat/mengzi-t5-base")
model = T5ForConditionalGeneration.from_pretrained("wawaup/MengziT5-Comment")
def generate_comment(input_ids,cnt_num):
outputs = model.generate(input_ids,
max_length=128,
do_sample=True,
temperature=0.9,
early_stopping=True,
repetition_penalty=10.0,
top_p=0.5,
num_return_sequences=cnt_num)
print(outputs)
preds_cleaned = [t5_tokenizer.decode(ids, skip_special_tokens=True,
clean_up_tokenization_spaces=True) for ids in outputs]
print(preds_cleaned)
return preds_cleaned
``` |