---
inference: true
license: apache-2.0
datasets:
- bitext/customer-support-intent-dataset
pipeline_tag: text-generation
---
# longchat-7b-qlora-customer-support Model Card
This repo contains the 4-bit LoRA (low-rank adaptation) adapter weights for the [longchat-7b-16k model](https://huggingface.co/lmsys/longchat-7b-16k), fine-tuned on [Bitext's customer support domain dataset](https://huggingface.co/datasets/bitext/customer-support-intent-dataset).
The Supervised Fine-Tuning (SFT) method is based on the [QLoRA paper](https://arxiv.org/abs/2305.14314) and uses 🤗 PEFT adapters, transformers, and bitsandbytes.
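For reference, a QLoRA-style adapter configuration with 🤗 PEFT looks roughly like the sketch below. The values shown are typical defaults from the QLoRA paper, not the exact hyperparameters used for this adapter (those are recorded in the repo's `adapter_config.json`):
```ipython
from peft import LoraConfig

## illustrative QLoRA-style adapter settings -- typical defaults, not
## necessarily the ones used here (see adapter_config.json in this repo)
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```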
## Model details
**Model type:**
longchat-7b-qlora-customer-support is a 4-bit LoRA (low-rank adaptation) adapter supervised fine-tuned on top of the [longchat-7b-16k model](https://huggingface.co/lmsys/longchat-7b-16k) with [Bitext's customer support domain dataset](https://huggingface.co/datasets/bitext/customer-support-intent-dataset).
It is a decoder-only causal language model.
**Language:**
English
**License:**
apache-2.0, inherited from the [base model](https://huggingface.co/lmsys/longchat-7b-16k) and the [dataset](https://huggingface.co/datasets/bitext/customer-support-intent-dataset).
**Base Model:**
lmsys/longchat-7b-16k
**Dataset:**
bitext/customer-support-intent-dataset
**GPU Memory Consumption:**
~6GB of GPU memory in 4-bit mode with both models fully loaded (base + QLoRA adapter)
## Install dependency packages
```shell
pip install -r requirements.txt
```
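If you are installing outside this repo, the core dependencies are roughly the following (a sketch; the `requirements.txt` shipped with this repo is authoritative for package versions):
```shell
## approximate dependency set -- the pinned versions in requirements.txt take precedence
pip install torch transformers peft bitsandbytes accelerate sentencepiece
```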
Per the [base model instructions](https://huggingface.co/lmsys/longchat-7b-16k), the [llama_condense_monkey_patch.py file](https://github.com/lm-sys/FastChat/blob/main/fastchat/model/llama_condense_monkey_patch.py) is needed to load the base model properly. This file is already included in this repo.
## Load the model in 4-bit mode
```ipython
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from llama_condense_monkey_patch import replace_llama_with_condense
from peft import PeftConfig
from peft import PeftModel
import torch
## config device params & load the base model plus adapter
peft_model_id = "mingkuan/longchat-7b-qlora-customer-support"
base_model_id = "lmsys/longchat-7b-16k"

## apply the RoPE condense monkey patch so the 16k-context base model loads correctly
config = AutoConfig.from_pretrained(base_model_id)
replace_llama_with_condense(config.rope_condense_ratio)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=False)

## 4-bit NF4 quantization with double quantization; compute in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

## load the quantized base model; quantization_config already enables 4-bit mode,
## so a separate load_in_4bit flag is redundant (and conflicts on newer transformers)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    return_dict=True,
    trust_remote_code=True,
    quantization_config=nf4_config,
    torch_dtype=torch.float16,
    device_map="auto",
)

## attach the fine-tuned LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
```
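As a quick sanity check after loading, the combined footprint can be printed with the standard transformers `get_memory_footprint()` helper; it should land close to the ~6GB figure noted above (the exact number varies by environment):
```ipython
## rough memory sanity check; expect roughly 6GB in 4-bit mode
print(f'Loaded model footprint: {model.get_memory_footprint() / 1024**3:.2f} GB')
model.eval()
```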
## Run inference with the model
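The snippet below calls a `generate_prompt()` helper that is not defined in this card. Here is a minimal sketch, assuming a plain instruction-style template; the exact template used during fine-tuning may differ, so adjust it to match the training data format:
```ipython
## hypothetical prompt wrapper -- the real template used during
## fine-tuning may differ; match it to the training data format
def generate_prompt(query):
    return f"### Instruction:\n{query}\n\n### Response:\n"
```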
```ipython
def getLLMResponse(prompt):
    ## tokenize the prompt and move it onto the GPU
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids.cuda()
    ## do_sample=True so the temperature setting actually takes effect
    output = model.generate(inputs=input_ids, do_sample=True, temperature=0.5, max_new_tokens=256)
    ## decode, then strip the echoed prompt so only the model's reply remains
    response = tokenizer.decode(output[0], skip_special_tokens=True)[len(prompt):]
    return response

query = 'help me to setup my new shipping address.'
response = getLLMResponse(generate_prompt(query))
print(f'\nUserInput:{query}\n\nLLM:\n{response}\n\n')
```
Inference Output:
```shell
{
"category": "SHIPPING",
"intent": "setup_new_shipping_address",
"answer": "Sure, I can help you with that. Can you please provide me your full name, current shipping address, and the new shipping address you would like to set up?"
}
```
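Since the adapter replies with a JSON object carrying the predicted category, intent, and answer, the response can be consumed programmatically. A short sketch, assuming the reply is valid JSON:
```ipython
import json

## route on the predicted intent; fall back gracefully if the
## model ever emits a non-JSON reply
try:
    result = json.loads(response)
    print(result['category'], result['intent'])
except json.JSONDecodeError:
    print('Model reply was not valid JSON:', response)
```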