Chat Moderators
Fine-tuned versions of Phi models
This model is fine-tuned using LoRA (PEFT) on Phi-4 (4-bit Unsloth). To use it, load the base model, attach the LoRA adapter, and run inference as shown below.
Before running the code, make sure you have the necessary dependencies installed:
pip install unsloth peft transformers torch
from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load the 4-bit quantized Phi-4 base model
base_model_name = "unsloth/Phi-4-unsloth-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_name,
    max_seq_length=4096,  # must match the fine-tuning configuration
    load_in_4bit=True,
)

# Attach the fine-tuned LoRA adapter
lora_model_name = "Machlovi/Phi_Fullshot"
model = PeftModel.from_pretrained(model, lora_model_name)

# Run inference
input_text = "Why do we need to go to see something?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=4)

# Decode and print the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
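Phi-4 is an instruct-tuned model, so for chat-style moderation prompts you may get more reliable behavior by formatting the input with the tokenizer's chat template instead of passing raw text. A minimal sketch (the message content is just an illustrative example):

# Optional: format the prompt with the model's chat template
messages = [
    {"role": "user", "content": "Why do we need to go to see something?"},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # append the assistant turn marker
)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))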
Make sure max_seq_length matches the training configuration.
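If you want to deploy the model without the PEFT dependency, you can optionally merge the adapter weights into the base model. Note that merging into a 4-bit quantized base is lossy and version-dependent; for a clean merge, a common approach is to reload the base model in 16-bit precision first. A hedged sketch using PEFT's merge_and_unload (the output directory name is a placeholder):

# Optional: merge the LoRA weights into the base model for standalone use
merged_model = model.merge_and_unload()
merged_model.save_pretrained("phi4-moderator-merged")  # placeholder path
tokenizer.save_pretrained("phi4-moderator-merged")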
This Phi model was trained 2x faster with Unsloth and Hugging Face's TRL library.
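For reference, a fine-tuning run of this kind typically looks like the following Unsloth + TRL sketch. The LoRA rank, target modules, training arguments, and dataset below are illustrative placeholders, not the configuration actually used for this model, and SFTTrainer keyword arguments vary across TRL versions:

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

# Placeholder dataset; replace with the real moderation data
dataset = Dataset.from_dict({"text": ["<user message> -> <moderation label>"]})

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Phi-4-unsloth-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Wrap the base model with LoRA adapters (rank and modules are assumptions)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()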