---
license: cc-by-nc-sa-4.0
datasets:
- NorGLM/NO-ConvAI2
language:
- 'no'
pipeline_tag: text-generation
---

# Model Card

NorGPT-3B-conversation-peft is trained on top of the [NorGPT-3B](https://huggingface.co/NorGLM/NorGPT-3B) model on the [NO-ConvAI2](https://huggingface.co/datasets/NorGLM/NO-ConvAI2) dataset.

Prompt format:

```
Human: {prompt} Robot: |||\n {answer}
```

Inference prompt:

```
Human: {prompt} Robot: |||\n
```
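
For example, a minimal sketch of how a user message maps onto the inference format (the `build_prompt` helper and the sample message are illustrative, not part of the released code):

```python
def build_prompt(user_message: str) -> str:
    # Wrap a user message in the format the model was fine-tuned on;
    # the model's reply is generated after the "|||\n" delimiter.
    return f"Human: {user_message} Robot: |||\n"

prompt = build_prompt("Hei, hvordan har du det?")
```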

## Run the Model

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from tqdm.auto import tqdm

source_model_id = "NorGLM/NorGPT-3B"
peft_model_id = "NorGLM/NorGPT-3B-conversation-peft"

# Load the PEFT configuration and the base model.
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(source_model_id, device_map='balanced')

# The tokenizer has no dedicated pad token, so reuse the EOS token.
tokenizer_max_len = 2048
tokenizer_config = {'pretrained_model_name_or_path': source_model_id,
                    'model_max_length': tokenizer_max_len}
tokenizer = AutoTokenizer.from_pretrained(**tokenizer_config)
tokenizer.pad_token = tokenizer.eos_token

# Attach the conversation adapter on top of the base model.
model = PeftModel.from_pretrained(model, peft_model_id)
```
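
As a quick smoke test, assuming the model and tokenizer above are loaded (the sample prompt is illustrative):

```python
prompt = "Human: Hei, hvordan har du det? Robot: |||\n"
# Move inputs to the model's device; with device_map='balanced',
# accelerate dispatches the rest.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50,
                            pad_token_id=tokenizer.eos_token_id)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```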

## Inference Example

Evaluate the loaded model on the test set of the NO-ConvAI2 dataset:
```python
from datasets import load_dataset
import pandas as pd

def load_and_prepare_data_last_prompt(df):
    """Load and separate the last prompt from the full prompt."""
    # Columns: id, turn_id, prompt, answer
    last_prompt = ["Human: " + df['prompt'][i].split("Human:")[-1]
                   for i in range(len(df))]
    df['last_prompt'] = last_prompt
    return df

def generate_text(text, max_length=200):
    # Generate with greedy search.
    model_inputs = tokenizer(text, return_attention_mask=True, return_tensors="pt",
                             padding=True, truncation=True, max_length=tokenizer_max_len)

    with torch.no_grad():
        output_tokens = model.generate(
            **model_inputs, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)

    text_outputs = [tokenizer.decode(
        x, skip_special_tokens=True) for x in output_tokens]

    return text_outputs

print("--LOADING EVAL DATA---")
eval_data = load_dataset("NorGLM/NO-ConvAI2", data_files="test_PersonaChat_prompt.json")
prompts = eval_data['train']['prompt']
positive_samples = eval_data['train']['answer']

print("--MAKING PREDICTIONS---")
model.eval()

output_file = <output file name>  # fill in your output path
generated_text = []

for prompt in tqdm(prompts):
    generated_text.append(generate_text(prompt, max_length=tokenizer_max_len))

df = pd.DataFrame({'prompts': prompts, 'generated_text': generated_text, 'positive_sample': positive_samples})

print("Save results to csv file...")
df.to_csv(output_file)
```
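
Since the model continues the prompt, the reply has to be cut out of the decoded text. A minimal post-processing sketch, assuming replies follow the `|||\n` delimiter from the prompt format above (the helper name is illustrative):

```python
def extract_answer(decoded: str) -> str:
    # The reply follows the "|||\n" delimiter; drop any new "Human:" turn
    # the model may start after its answer.
    answer = decoded.split("|||\n")[-1]
    return answer.split("Human:")[0].strip()
```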

## Note

More training details will be released soon!