---
language:
- en
license: cc-by-4.0
library_name: peft
datasets:
- Salesforce/xlam-function-calling-60k
---
## Model Details
This is an adapter for [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned for function calling on [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k). The adapter is undertrained; its main purpose is testing the function-calling capabilities of LLMs.

Example of loading the base model with the adapter and running inference:
```python
import os
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use bf16 and FlashAttention 2 on GPUs that support them; otherwise fall
# back to fp16 with PyTorch's SDPA attention.
if torch.cuda.is_bf16_supported():
    os.system('pip install flash_attn')  # FlashAttention 2 is an extra dependency
    compute_dtype = torch.bfloat16
    attn_implementation = 'flash_attention_2'
else:
    compute_dtype = torch.float16
    attn_implementation = 'sdpa'

adapter = "kaitchup/Meta-Llama-3-8B-xLAM-Adapter"
model_name = "meta-llama/Meta-Llama-3-8B"

# Load the tokenizer and base model, then attach the adapter
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=compute_dtype,
    device_map={"": 0},
    attn_implementation=attn_implementation,
)
model = PeftModel.from_pretrained(model, adapter)

# Greedy decoding (do_sample=False); temperature has no effect in this mode
prompt = "Check if the numbers 8 and 1233 are powers of two.\n\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=150)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
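
If you want to serve the model without loading PEFT at inference time, you can merge the adapter weights into the base model first. Below is a minimal sketch using PEFT's `merge_and_unload`; the output directory name is an assumption, and `model` and `tokenizer` refer to the objects created above:

```python
# Merge the adapter weights into the base model and drop the PEFT wrappers.
merged_model = model.merge_and_unload()

# Save the merged model and tokenizer to a local directory (name is illustrative).
merged_model.save_pretrained("Meta-Llama-3-8B-xLAM-merged")
tokenizer.save_pretrained("Meta-Llama-3-8B-xLAM-merged")
```

The merged checkpoint can then be loaded with a plain `AutoModelForCausalLM.from_pretrained` call, without `peft` installed.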
- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Language(s) (NLP):** English
- **License:** cc-by-4.0