---
base_model: unsloth/gemma-2-2b-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
datasets:
- Salesforce/xlam-function-calling-60k
library_name: peft
---

# Model Card for gemma2-2b-xlam-function-calling

This model is a function-calling variant of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b), fine-tuned on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset.

# Uploaded model

- **Developed by:** akshayballal
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-2b-bnb-4bit

### Usage

```python
from unsloth import FastLanguageModel

max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "gemma2-2b-xlam-function-calling", # Path or Hub repo ID of the fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

# Prompt template with slots for the tool list, the user query, and the response.
alpaca_prompt = """Below are the tools that you have access to these tools. Use them if required.

### Tools:
{}

### Query:
{}

### Response:
{}"""

tools = [
    {
        "name": "upcoming",
        "description": "Fetches upcoming CS:GO matches data from the specified API endpoint.",
        "parameters": {
            "content_type": {
                "description": "The content type for the request, default is 'application/json'.",
                "type": "str",
                "default": "application/json",
            },
            "page": {
                "description": "The page number to retrieve, default is 1.",
                "type": "int",
                "default": "1",
            },
            "limit": {
                "description": "The number of matches to retrieve per page, default is 10.",
                "type": "int",
                "default": "10",
            },
        },
    }
]
query = """Can you fetch the upcoming CS:GO matches for page 1 with a 'text/xml' content type and a limit of 20 matches? Also, can you fetch the upcoming matches for page 2 with the 'application/xml' content type and a limit of 15 matches?"""

# Leave the response slot empty so the model fills it in; move tensors to the model's device.
model_input = tokenizer(
    alpaca_prompt.format(tools, query, ""), return_tensors="pt"
).to(model.device)

# Greedy decoding (do_sample=False) gives deterministic output.
output = model.generate(**model_input, max_new_tokens=1024, do_sample=False)

# The decoded text contains the prompt followed by the generated response.
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
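
Because `decoded_output` contains the full prompt followed by the generation, the tool calls still have to be separated from the rest of the text. The sketch below is one possible way to do that, not part of the original card. It assumes the model answers in the style of the dataset's `answers` field (a list of objects with `name` and `arguments` keys); the helper `extract_tool_calls` and the sample string are hypothetical and only illustrate the idea.

```python
import ast
import json

RESPONSE_MARKER = "### Response:"


def extract_tool_calls(decoded_output: str) -> list:
    """Return the tool calls found after the last response marker."""
    # Keep only the text generated after the final "### Response:" header.
    response = decoded_output.rsplit(RESPONSE_MARKER, 1)[-1].strip()
    try:
        calls = json.loads(response)
    except json.JSONDecodeError:
        # Assumption: the model may echo Python-literal syntax (single quotes),
        # since the tools are formatted into the prompt as a Python list.
        calls = ast.literal_eval(response)
    # Normalise a single call to a one-element list.
    return calls if isinstance(calls, list) else [calls]


# Hypothetical output, for illustration only (not a real generation).
sample_output = (
    "### Query:\nfetch page 1\n\n### Response:\n"
    '[{"name": "upcoming", "arguments": '
    '{"content_type": "text/xml", "page": 1, "limit": 20}}]'
)

calls = extract_tool_calls(sample_output)
for call in calls:
    print(call["name"], call["arguments"])
```

If the response is neither valid JSON nor a Python literal, `ast.literal_eval` raises `ValueError` or `SyntaxError`, so callers may want to catch those and treat the generation as malformed.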
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |