---
language:
- en
license: cc-by-4.0
library_name: peft
datasets:
- Salesforce/xlam-function-calling-60k
---

## Model Details

This is an adapter for [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned for function calling on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset. This adapter is undertrained; its main purpose is testing the function-calling capabilities of LLMs.

- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Language(s) (NLP):** English
- **License:** cc-by-4.0

## How to Use This Adapter

```python
import os

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use bf16 and FlashAttention if the GPU supports them;
# otherwise fall back to fp16 and PyTorch's SDPA attention.
if torch.cuda.is_bf16_supported():
    os.system('pip install flash_attn')
    compute_dtype = torch.bfloat16
    attn_implementation = 'flash_attention_2'
else:
    compute_dtype = torch.float16
    attn_implementation = 'sdpa'

adapter = "kaitchup/Meta-Llama-3-8B-xLAM-Adapter"
model_name = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# Load the base model on GPU 0, then attach the adapter.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=compute_dtype,
    device_map={"": 0},
    attn_implementation=attn_implementation,
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "Check if the numbers 8 and 1233 are powers of two.\n\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Greedy decoding; temperature is not used when do_sample=False.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=150)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
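If you prefer to serve the model without PEFT at inference time, the LoRA weights can be folded into the base model with PEFT's `merge_and_unload`. A minimal sketch, assuming the `model` and `tokenizer` objects from the snippet above; the output directory name is a placeholder:

```python
# Fold the LoRA weights into the base model (assumes `model` is the
# PeftModel loaded above; the output path is a placeholder).
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./llama-3-8b-xlam-merged")
tokenizer.save_pretrained("./llama-3-8b-xlam-merged")
```

The merged checkpoint can then be loaded with `AutoModelForCausalLM.from_pretrained` like any standalone model.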