---
license: apache-2.0
language:
- en
- zh
inference: false
---

# SeqGPT-560M

This repository hosts the weights of SeqGPT-560M, a compact model for open-domain Natural Language Understanding (NLU). See our GitHub [repo](https://github.com/Alibaba-NLP/SeqGPT) for more details.

## Model Details

### Model Description

The model is fine-tuned from [BLOOMZ-560M](https://huggingface.co/bigscience/bloomz-560m).

### Model Sources

- **Repository:** [SeqGPT](https://github.com/Alibaba-NLP/SeqGPT)
- **Paper:** [arXiv](https://arxiv.org/abs/2308.10529)
- **Demo:** [demo](https://www.modelscope.cn/studios/TTCoding/open_ner/summary)

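## Uses

The following script loads the model and runs an interactive loop that classifies the input text or extracts spans from it, given a label set.
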
```py
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name_or_path = 'DAMO-NLP/SeqGPT-560M'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path)
# Pad and truncate on the left so the end of the prompt
# (label set and [GEN]) is kept when a long input is cut
tokenizer.padding_side = 'left'
tokenizer.truncation_side = 'left'

# Use half precision on GPU when one is available
if torch.cuda.is_available():
    model = model.half().cuda()
model.eval()
GEN_TOK = '[GEN]'

while True:
    sent = input('输入/Input: ').strip()
    task = input('分类/classify press 1, 抽取/extract press 2: ').strip()
    # Normalize full-width commas to ASCII commas
    labels = input('标签集/Label-Set (e.g., labelA,LabelB,LabelC): ').strip().replace(',', ',')
    # The template uses the Chinese task names: 分类 (classify) or 抽取 (extract)
    task = '分类' if task == '1' else '抽取'

    # Changing the instruction can harm the performance
    p = '输入: {}\n{}: {}\n输出: {}'.format(sent, task, labels, GEN_TOK)
    inputs = tokenizer(p, return_tensors="pt", padding=True, truncation=True, max_length=1024)
    inputs = inputs.to(model.device)
    outputs = model.generate(**inputs, num_beams=4, do_sample=False, max_new_tokens=256)
    # Keep only the newly generated tokens (drop the echoed prompt)
    outputs = outputs[0][inputs['input_ids'].shape[1]:]
    response = tokenizer.decode(outputs, skip_special_tokens=True)
    print('BOT: ========== \n{}'.format(response))
```
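
For non-interactive use, the prompt construction from the script above can be factored into a small helper. The sketch below is only an illustration: `build_prompt` and `TASK_NAMES` are hypothetical names that are not part of the SeqGPT repository, and stripping whitespace around labels is an added assumption. The template string itself is copied unchanged from the script, since changing the instruction can harm performance.

```python
GEN_TOK = '[GEN]'  # generation marker, same value as in the script above

# Hypothetical mapping from English task names to the Chinese
# keywords used in the instruction template.
TASK_NAMES = {'classify': '分类', 'extract': '抽取'}


def build_prompt(sent, task, labels):
    """Format one prompt; `task` is 'classify' or 'extract'."""
    if task not in TASK_NAMES:
        raise ValueError("task must be 'classify' or 'extract'")
    # Accept either a comma-separated string or a list of labels,
    # normalizing full-width commas to ASCII commas.
    if isinstance(labels, str):
        labels = labels.replace(',', ',').split(',')
    # Assumption: surrounding whitespace and empty labels are dropped.
    label_str = ','.join(label.strip() for label in labels if label.strip())
    return '输入: {}\n{}: {}\n输出: {}'.format(
        sent.strip(), TASK_NAMES[task], label_str, GEN_TOK)


prompt = build_prompt('Alibaba is based in Hangzhou.',
                      'extract', ['organization', 'location'])
print(prompt)
```

The returned string can be passed to `tokenizer(...)` and `model.generate(...)` exactly as `p` is in the interactive loop.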