How to Get Started with the Model

🚀 How to Use This Model for Inference

This model is a LoRA (PEFT) adapter fine-tuned on Phi-4 (the 4-bit Unsloth build). To use it, you need to:

  1. Load the base model
  2. Load the LoRA adapter
  3. Run inference

📌 Install Required Libraries

Before running the code, make sure you have the necessary dependencies installed:

pip install unsloth peft transformers torch
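
Depending on your environment, 4-bit loading may also need bitsandbytes (and accelerate) installed explicitly. unsloth normally pulls these in as dependencies, but if model loading fails you can add them yourself:

pip install bitsandbytes accelerate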

πŸ“ Load and Run Inference


from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load the base model
base_model_name = "unsloth/Phi-4-unsloth-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=base_model_name,
    max_seq_length=4096,  # Must match fine-tuning
    load_in_4bit=True,
)

# Load the fine-tuned LoRA adapter
lora_model_name = "Machlovi/Phi_Fullshot"
model = PeftModel.from_pretrained(model, lora_model_name)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path

# Run inference (max_new_tokens=4 keeps the completion very short;
# raise it if you expect longer responses)
input_text = "Why do we need to go to see something?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=4)

# Decode and print the response
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
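
If the adapter was trained on chat-formatted data (not confirmed here), wrapping the prompt in the tokenizer's chat template and decoding only the newly generated tokens usually gives cleaner output. A minimal sketch, assuming the tokenizer ships a chat template:

# Sketch: chat-template inference (assumes tokenizer.chat_template is set)
messages = [{"role": "user", "content": "Why do we need to go to see something?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the tokens generated after the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))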


💡 Notes

  • The base model is quantized to 4-bit for efficiency.
  • Ensure max_seq_length matches the training configuration (4096 here).
  • This model requires a CUDA-capable GPU for inference; see the check below.
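
Because a GPU is mandatory, it can help to fail fast with a clear message before loading any weights. A minimal sketch using only standard torch calls:

import torch

# Abort early with a readable error if no CUDA device is visible
if not torch.cuda.is_available():
    raise RuntimeError("A CUDA-capable GPU is required to run this model.")
print(f"Using GPU: {torch.cuda.get_device_name(0)}")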


Uploaded model

  • Developed by: Machlovi
  • License: apache-2.0
  • Finetuned from model: unsloth/Phi-4-unsloth-bnb-4bit

This Phi-4 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
