Meta-Llama-3-8B-Instruct_bitsandbytes_4bit fine-tuned on Salesforce/xlam-function-calling-60k
Function-Calling Agent
LoRA Adapter Head
Parameter-Efficient Fine-Tuning (PEFT) of a 4-bit quantized Meta-Llama-3-8B-Instruct on the Salesforce/xlam-function-calling-60k dataset.
- Language(s) (NLP): English
- License: openrail
- Quantization: BitsAndBytes
- PEFT: LoRA
- Fine-tuned from model: SwastikM/Meta-Llama-3-8B-Instruct_bitsandbytes_4bit
- Dataset: Salesforce/xlam-function-calling-60k
Intended uses & limitations
Addresses the efficacy of quantization and PEFT. Implemented as a personal project.
How to use
Install Required Libraries
!pip install transformers accelerate "bitsandbytes>0.37.0"
!pip install peft
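For context, the 4-bit base checkpoint referenced above was produced with bitsandbytes. Below is a minimal sketch of how such a checkpoint can be created from the original weights; the quantization settings shown (NF4, double quantization, fp16 compute dtype) are assumptions, not the confirmed configuration of the published checkpoint.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed NF4 settings; the actual config of
# SwastikM/Meta-Llama-3-8B-Instruct_bitsandbytes_4bit may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

quantized_base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)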
Setup Adapter with Base Model
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the 4-bit quantized base model and attach the LoRA adapter.
# device_map="auto" already places the quantized weights on the GPU, so no .to("cuda") is needed.
base_model = AutoModelForCausalLM.from_pretrained("SwastikM/Meta-Llama-3-8B-Instruct_bitsandbytes_4bit", device_map="auto")
model = PeftModel.from_pretrained(base_model, "SwastikM/Meta-Llama3-8B-Chat-Adapter")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model.eval()
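Alternatively, PEFT can resolve the base model from the adapter config and load both in one call via AutoPeftModelForCausalLM. A brief sketch, assuming the adapter config points at the quantized base checkpoint:
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base model recorded in the adapter config and attaches the LoRA weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    "SwastikM/Meta-Llama3-8B-Chat-Adapter",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model.eval()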
Setup Template and Infer
x1 = {"role": "system", "content": """You are a APIGen Function Calling Tool. You will br provided with a user query and associated tools for answering the query.
query (string): The query or problem statement.
tools (array): An array of available tools that can be used to solve the query.
Each tool is represented as an object with the following properties:
name (string): The name of the tool.
description (string): A brief description of what the tool does.
parameters (object): An object representing the parameters required by the tool.
Each parameter is represented as a key-value pair, where the key is the parameter name and the value is an object with the following properties:
type (string): The data type of the parameter (e.g., "int", "float", "list").
description (string): A brief description of the parameter.
required (boolean): Indicates whether the parameter is required or optional.
You will provide the Answer array.
Answers array provides the specific tool and arguments used to generate each answer."""}
x2 = {"role": "user", "content": None}
x3 = {"role": "assistant", "content": None}
user_template = 'Query: {Q} Tools: {T}'
response_template = '{A}'
Q = "Where can I find live giveaways for beta access and games?"
T = """[{"name": "live_giveaways_by_type", "description": "Retrieve live giveaways from the GamerPower API based on the specified type.", "parameters": {"type": {"description": "The type of giveaways to retrieve (e.g., game, loot, beta).", "type": "str", "default": "game"}}}]"""
x2['content'] = f'{user_template.format(Q=Q,T=T)}'
prompts = [x1,x2]
input_ids = tokenizer.apply_chat_template(
prompts,
add_generation_prompt=True,
return_tensors="pt"
).to(model.device)
terminators = [
tokenizer.eos_token_id,
tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
outputs = model.generate(
input_ids,
max_new_tokens=256,
eos_token_id=terminators
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
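The answers in xlam-function-calling-60k are JSON arrays of tool calls, so the decoded response can usually be parsed directly. A minimal sketch, assuming the model emits valid JSON (which is not guaranteed):
import json

decoded = tokenizer.decode(response, skip_special_tokens=True)
try:
    # Expected shape: [{"name": "<tool_name>", "arguments": {...}}, ...]
    tool_calls = json.loads(decoded)
    for call in tool_calls:
        print(call["name"], call["arguments"])
except json.JSONDecodeError:
    # Fall back to the raw text if the output is not clean JSON.
    print(decoded)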
Size Comparison
The table compares the VRAM required to load and train the FP16 base model and the 4-bit bitsandbytes-quantized model with PEFT. The base model values are taken from the Hugging Face Model Memory Calculator.
| Model | Total Size | Training Using Adam |
|---|---|---|
| Base Model (FP16) | 28.21 GB | 56.42 GB |
| 4-bit Quantized + PEFT | 5.21 GB | 13 GB |
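To verify the quantized footprint on your own hardware, Transformers exposes get_memory_footprint() on loaded models; a quick check (the reported number will vary slightly with the adapter and buffers):
# Size in GB of the loaded 4-bit base model plus LoRA adapter.
print(f"{model.get_memory_footprint() / 1024**3:.2f} GB")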
Training Details
Training Data
Dataset: Salesforce/xlam-function-calling-60k
Trained on the instruction column of 20,00 randomly shuffled rows.
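A sketch of drawing such a subset with the datasets library; the seed and sample count below are illustrative, not the values actually used:
from datasets import load_dataset

# Shuffle and take a subset of the function-calling dataset (seed/size are illustrative).
dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")
subset = dataset.shuffle(seed=42).select(range(20000))
print(len(subset), subset.column_names)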
Training Procedure
Hugging Face Accelerate with a custom training loop.
Training Hyperparameters
- Optimizer: AdamW
- lr: 2e-5
- decay: linear
- batch_size: 1
- gradient_accumulation_steps: 2
- fp16: True
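A condensed sketch of the training setup these hyperparameters describe, using Accelerate with a plain PyTorch loop. The warmup steps, total step count, and dataloader are illustrative placeholders, and model is assumed to be the LoRA-wrapped model from the configuration below.
from accelerate import Accelerator
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

# `model` is the LoRA-wrapped model and `train_dataloader` a tokenized DataLoader (assumed defined).
accelerator = Accelerator(mixed_precision="fp16", gradient_accumulation_steps=2)

optimizer = AdamW(model.parameters(), lr=2e-5)
num_training_steps = len(train_dataloader)  # single epoch, illustrative
scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=0, num_training_steps=num_training_steps)

model, optimizer, train_dataloader, scheduler = accelerator.prepare(model, optimizer, train_dataloader, scheduler)

model.train()
for batch in train_dataloader:
    with accelerator.accumulate(model):  # handles gradient accumulation
        loss = model(**batch).loss
        accelerator.backward(loss)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()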
LoraConfig
- r: 8
- lora_alpha: 32
- task_type: TaskType.CAUSAL_LM
- lora_dropout: 0.1
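For reference, the corresponding PEFT configuration. The card does not list the adapted modules, so target_modules below is an assumption (PEFT can also infer Llama defaults when it is omitted):
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type=TaskType.CAUSAL_LM,
    # Assumed attention projections; the actual adapted modules are not stated in the card.
    target_modules=["q_proj", "v_proj"],
)

peft_model = get_peft_model(base_model, lora_config)  # base_model: the 4-bit quantized model
peft_model.print_trainable_parameters()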
Hardware
- GPU: P100
Acknowledgment
- Thanks to @Merve Noyan for the precise intro.
- Thanks to the @HuggingFace team for the blog.
- Thanks to @Salesforce for the marvelous dataset.
Model Card Authors
Swastik Maiti