---
language: 
- en
inference: false
license: apache-2.0
---
# YuyuanQA-3.5B model (Medical), one model of Fengshenbang-LM
YuyuanQA-3.5B is fine-tuned on 10,000 medical QA pairs, based on the Yuyuan-3.5B model.
Question answering (QA) is an important subject in natural language processing and information retrieval, with many real-world industrial applications. Traditional methods are often complex, and their core algorithms involve machine learning, deep learning, and knowledge graphs.
We hope to explore a simpler and more effective way to do question answering directly, using the powerful memorization and understanding abilities of a large model. The YuyuanQA-3.5B model is such an attempt, and it performs well in subjective tests. We also evaluated it on 100 QA pairs with BLEU:
| gram | 1-gram | 2-gram | 3-gram | 4-gram |
| --- | --- | --- | --- | --- |
| BLEU score | 0.357727 | 0.2713 | 0.22304 | 0.19099 |
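For reference, cumulative BLEU-n scores like those above can be computed with NLTK. This is a minimal sketch; the reference and hypothesis strings are hypothetical placeholders, not the actual 100-pair test set.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical reference answer and model output (not the real test data).
reference = "gout patients should limit purine-rich foods such as organ meats".split()
hypothesis = "gout patients should avoid foods rich in purine like organ meats".split()

smooth = SmoothingFunction().method1
for n in range(1, 5):
    # Cumulative BLEU-n: equal weights over the 1..n-gram precisions.
    weights = tuple(1.0 / n for _ in range(n))
    score = sentence_bleu([reference], hypothesis,
                          weights=weights, smoothing_function=smooth)
    print(f'{n}-gram BLEU: {score:.5f}')
```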
## Usage
### Load model
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

hf_model_path = 'model_path'
tokenizer = GPT2Tokenizer.from_pretrained(hf_model_path)
model = GPT2LMHeadModel.from_pretrained(hf_model_path)
```
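The `'model_path'` placeholder above points to a local checkpoint directory. Presumably the weights can also be fetched directly from the Hugging Face Hub under this repository's identifier (assumed here from the model card's name):

```python
# Assumed Hub identifier for this model card; use a local path otherwise.
hf_model_path = 'IDEA-CCNL/YuyuanQA-3.5B'
tokenizer = GPT2Tokenizer.from_pretrained(hf_model_path)
model = GPT2LMHeadModel.from_pretrained(hf_model_path)
model.eval()  # inference only; disables dropout
```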
### Generation
```python
question = "What should gout patients pay attention to in diet?"
inputs = tokenizer(f'Question:{question} answer:', return_tensors='pt')

generation_output = model.generate(**inputs,
                                   return_dict_in_generate=True,
                                   output_scores=True,
                                   max_length=150,
                                   # max_new_tokens=80,
                                   do_sample=True,
                                   top_p=0.6,
                                   eos_token_id=50256,
                                   pad_token_id=0,
                                   num_return_sequences=5)

for idx, sentence in enumerate(generation_output.sequences):
    print('next sentence %d:\n' % idx,
          tokenizer.decode(sentence).split('<|endoftext|>')[0])
    print('*' * 40)
```
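For convenience, the `Question:{question} answer:` prompt format and the decoding step can be wrapped in a small helper. This is a sketch built on the snippet above; the `answer()` name is ours and not part of the original card.

```python
def answer(question, num_answers=1, max_length=150):
    """Sketch: query the model with the 'Question:... answer:' prompt format."""
    inputs = tokenizer(f'Question:{question} answer:', return_tensors='pt')
    outputs = model.generate(**inputs,
                             max_length=max_length,
                             do_sample=True,
                             top_p=0.6,
                             eos_token_id=50256,
                             pad_token_id=0,
                             num_return_sequences=num_answers)
    answers = []
    for seq in outputs:
        text = tokenizer.decode(seq).split('<|endoftext|>')[0]
        # Keep only the text generated after the 'answer:' marker.
        answers.append(text.split('answer:', 1)[-1].strip())
    return answers

print(answer('What should gout patients pay attention to in diet?'))
```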
## Citation
If you find this resource useful, please cite the following website in your paper.
```
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2022},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```