devanshamin's picture
Update README.md
486a7e9 verified
---
base_model: Qwen/Qwen2-1.5B-Instruct
datasets:
- devanshamin/gem-viggo-function-calling
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: Qwen2-1.5B-Instruct-Function-Calling-v1
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# Qwen2-1.5B-Instruct-Function-Calling-v1
This model is a fine-tuned version of [Qwen/Qwen2-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2-1.5B-Instruct) on [devanshamin/gem-viggo-function-calling](https://huggingface.co/datasets/devanshamin/gem-viggo-function-calling) dataset.
## Updated Chat Template
> Note: The template supports multiple tools but the model is fine-tuned on a dataset consisting of examples with a single tool.
- The chat template has been added to the [tokenizer_config.json](https://huggingface.co/devanshamin/Qwen2-1.5B-Instruct-Function-Calling-v1/blob/7ee7c020cefdb0101939469de608acc2afa7809e/tokenizer_config.json#L34).
- Supports prompts with and without tools.
```python
chat_template = (
"{% for message in messages %}"
"{% if loop.first and messages[0]['role'] != 'system' %}"
"{% if tools %}"
"<|im_start|>system\nYou are a helpful assistant with access to the following tools. Use them if required - \n"
"```json\n{{ tools | tojson }}\n```<|im_end|>\n"
"{% else %}"
"<|im_start|>system\nYou are a helpful assistant.\n<|im_end|>\n"
"{% endif %}"
"{% endif %}"
"{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}"
"{% endfor %}"
"{% if add_generation_prompt %}"
"{{ '<|im_start|>assistant\n' }}"
"{% endif %}"
)
```
## Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "Qwen2-1.5B-Instruct-Function-Calling-v1"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
def inference(prompt: str) -> str:
model_inputs = tokenizer([prompt], return_tensors="pt").to('cuda')
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
return response
messages = [{"role": "user", "content": "What is the speed of light?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = inference(prompt)
print(response)
```
## Tool Usage
### Basic
```python
import json
from typing import List, Dict
def get_prompt(user_input: str, tools: List[Dict] | None = None):
prompt = 'Extract the information from the following - \n{}'.format(user_input)
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True,
tools=tools
)
return prompt
tool = {
"type": "function",
"function": {
"name": "get_company_info",
"description": "Correctly extracted company information with all the required parameters with correct types",
"parameters": {
"properties": {
"name": {"title": "Name", "type": "string"},
"investors": {
"items": {"type": "string"},
"title": "Investors",
"type": "array"
},
"valuation": {"title": "Valuation", "type": "string"},
"source": {"title": "Source", "type": "string"}
},
"required": ["investors", "name", "source", "valuation"],
"type": "object"
}
}
}
input_text = "Founded in 2021, Pluto raised $4 million across multiple seed funding rounds, valuing the company at $12 million (pre-money), according to PitchBook. The startup was backed by investors including Switch Ventures, Caffeinated Capital and Maxime Seguineau."
prompt = get_prompt(input_text, tools=[tool])
response = inference(prompt)
print(response)
# ```json
# {
# "name": "get_company_info",
# "arguments": {
# "name": "Pluto",
# "investors": [
# "Switch Ventures",
# "Caffeinated Capital",
# "Maxime Seguineau"
# ],
# "valuation": "$12 million",
# "source": "PitchBook"
# }
# }
# ```
```
### Advanced
```python
import re
from enum import Enum
from pydantic import BaseModel, Field # pip install pydantic
from instructor.function_calls import openai_schema # pip install instructor
# Define functions using pydantic classes
class PaperCategory(str, Enum):
TYPE_1_DIABETES = 'Type 1 Diabetes'
TYPE_2_DIABETES = 'Type 2 Diabetes'
class Classification(BaseModel):
label: PaperCategory = Field(..., description='Provide the most likely category')
reason: str = Field(..., description='Give a detailed explanation with quotes from the abstract explaining why the paper is related to the chosen label.')
function_definition = openai_schema(Classification).openai_schema
tool = dict(type='function', function=function_definition)
input_text = "1,25-dihydroxyvitamin D(3) (1,25(OH)(2)D(3)), the biologically active form of vitamin D, is widely recognized as a modulator of the immune system as well as a regulator of mineral metabolism. The objective of this study was to determine the effects of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice, a murine model of human type I diabetes. We have found that vitamin D-deficiency increases the incidence of diabetes in female mice from 46% (n=13) to 88% (n=8) and from 0% (n=10) to 44% (n=9) in male mice as of 200 days of age when compared to vitamin D-sufficient animals. Addition of 50 ng of 1,25(OH)(2)D(3)/day to the diet prevented disease onset as of 200 days and caused a significant rise in serum calcium levels, regardless of gender or vitamin D status. Our results indicate that vitamin D status is a determining factor of disease susceptibility and oral administration of 1,25(OH)(2)D(3) prevents diabetes onset in NOD mice through 200 days of age."
prompt = get_prompt(input_text, tools=[tool])
output = inference(prompt)
print(output)
# ```json
# {
# "name": "Classification",
# "arguments": {
# "label": "Type 1 Diabetes",
# "reason": "The study investigated the effect of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice. It also concluded that vitamin D deficiency leads to an increase in diabetes incidence and that the addition of 1,25(OH)(2)D(3) can prevent diabetes onset in NOD mice."
# }
# }
# ```
# Extract JSON string using regex
output = re.search(r'```json\s*(\{.*?\})\s*```', output).group(1)
output = Classification(**json.loads(_output)['arguments'])
print(output)
# Classification(label=<PaperCategory.TYPE_1_DIABETES: 'Type 1 Diabetes'>, reason='The study investigated the effect of vitamin D status and treatment with 1,25(OH)(2)D(3) on diabetes onset in non-obese diabetic (NOD) mice. It also concluded that vitamin D deficiency leads to an increase in diabetes incidence and that the addition of 1,25(OH)(2)D(3) can prevent diabetes onset in NOD mice.')
```
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 200
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4004 | 0.0101 | 20 | 0.4852 |
| 0.3624 | 0.0201 | 40 | 0.3221 |
| 0.2855 | 0.0302 | 60 | 0.2818 |
| 0.2652 | 0.0402 | 80 | 0.2592 |
| 0.2214 | 0.0503 | 100 | 0.2463 |
| 0.2471 | 0.0603 | 120 | 0.2358 |
| 0.2122 | 0.0704 | 140 | 0.2310 |
| 0.2048 | 0.0804 | 160 | 0.2275 |
| 0.2406 | 0.0905 | 180 | 0.2251 |
| 0.2445 | 0.1006 | 200 | 0.2248 |
### Framework versions
```text
peft==0.11.1
transformers==4.42.3
torch==2.3.1+cu121
datasets==2.20.0
tokenizers==0.19.1
```