Model overview

  • Base model: LLaMA2-7B
  • Fine-tuning method: QLoRA
  • Dataset: databricks/databricks-dolly-15k
  • GPU: a single RTX 4090
  • Goal: instruction-tune the model
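As a rough illustration of why 4-bit quantization suits a single 24 GB card, the sketch below estimates the memory needed for the weights alone at several precisions. It assumes about 6.74 billion parameters for the 7B model and ignores activations, the KV cache, LoRA adapter weights, optimizer state, and quantization metadata, so real usage will be higher. The helper name `weight_gib` is illustrative only.

```python
# Back-of-the-envelope estimate of weight memory at different precisions.
# Assumption: ~6.74e9 parameters; only the weights are counted.
N_PARAMS = 6.74e9


def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Return the size of the weights in GiB for a given bit width."""
    return n_params * bits_per_param / 8 / 2**30


for name, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    print(f"{name:>5}: {weight_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the full-precision weights alone would not fit in 24 GB, while the 4-bit weights leave most of the card free for adapters and activations.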

Usage

  • Load the data
from datasets import load_dataset
from random import randrange

# Load the dataset from the Hub and pick one random sample
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")
sample = dataset[randrange(len(dataset))]
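Because `randrange` picks a different record on every run, comparing outputs across runs can be awkward. One possible variation, sketched here with a hypothetical helper `pick_index` and an arbitrary seed, is to draw the index from a seeded generator so the same sample is selected each time.

```python
import random


def pick_index(n_rows: int, seed: int = 42) -> int:
    """Return a reproducible row index in the range [0, n_rows)."""
    # A private Random instance avoids touching the global random state.
    return random.Random(seed).randrange(n_rows)


# Placeholder size for illustration; with the real dataset one would call
# pick_index(len(dataset)) and then index the dataset with the result.
idx = pick_index(1000)
print(idx)
```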
  • Use the model
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Fine-tuned checkpoint on the Hugging Face Hub
model_name_or_path = "snowfly/llama2-7b-QLoRA-dolly"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
# Load the weights quantized to 4 bits (requires bitsandbytes and a CUDA GPU)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
model = model.eval()  # inference mode: disables dropout


prompt = f"""### Instruction:
Use the Input below to create an instruction, which could have been used to generate the input using an LLM. 
 
### Input:
{sample['response']}
 
### Response:
"""
 
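# Tokenize the prompt and move it to the GPU
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()

# Sample up to 100 new tokens
outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.9)

# Decode only the newly generated tokens, i.e. everything after the prompt.
# Slicing by token count is more robust than slicing the decoded string by len(prompt).
generated = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)

print(f"Prompt:\n{sample['response']}\n")
print(f"Generated instruction:\n{generated}")
print(f"Ground truth:\n{sample['instruction']}")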
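To try several samples without repeating the template, the prompt construction above could be factored into a small helper. This is a minimal sketch, not part of the original card: `build_prompt` and `INSTRUCTION` are hypothetical names, and the separator lines between sections are normalized to plain empty lines, which is an assumption. If the model turns out to be sensitive to whitespace, the template should match the training format exactly.

```python
# Instruction text taken from the prompt template used above.
INSTRUCTION = (
    "Use the Input below to create an instruction, "
    "which could have been used to generate the input using an LLM."
)


def build_prompt(response_text: str) -> str:
    """Wrap a response in the Instruction / Input / Response template."""
    return (
        f"### Instruction:\n{INSTRUCTION}\n\n"
        f"### Input:\n{response_text}\n\n"
        "### Response:\n"
    )


# Stand-in text for illustration; in practice pass sample['response'].
demo_prompt = build_prompt("Paris is the capital of France.")
print(demo_prompt)
```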