Model Overview

A LoRA (Low-Rank Adaptation) fine-tuned adapter for the Llama-3.1-8B language model.

Model Details

  • Base Model: meta-llama/Llama-3.1-8B-Instruct
  • Adaptation Method: LoRA

Training Configuration

Training Hyperparameters

  • Learning Rate: 8.5e-5
  • Batch Size: 2
  • Number of Epochs: 1
  • Training Steps: ~1,400
  • Precision: BF16
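
As a rough sketch, the hyperparameters above map onto transformers.TrainingArguments as follows. The output directory is a placeholder and the exact trainer used is not stated in this card.

from transformers import TrainingArguments

# Hypothetical reconstruction of the run configuration from the values above;
# output_dir is an illustrative placeholder, not the original path.
training_args = TrainingArguments(
    output_dir="llama-3.1-8b-lora-adapter",
    learning_rate=8.5e-5,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    bf16=True,  # BF16 mixed precision
)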

LoRA Configuration

  • Rank (r): 16
  • Alpha: 16
  • Target Modules:
    • q_proj (Query projection)
    • k_proj (Key projection)
    • v_proj (Value projection)
    • o_proj (Output projection)
    • up_proj (MLP up projection)
    • down_proj (MLP down projection)
    • gate_proj (MLP gate projection)
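
For reference, this corresponds to the following peft.LoraConfig. lora_dropout is not stated in this card and is left at the library default.

from peft import LoraConfig

# LoRA configuration matching the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "up_proj", "down_proj", "gate_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)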

Usage

This adapter contains only the LoRA weights and must be loaded on top of the base Llama-3.1-8B-Instruct model.

Loading the Model

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model and tokenizer (BF16 matches the training precision)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "path_to_adapter")
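
Once loaded, the adapted model behaves like any transformers causal LM. The prompt and generation settings below are illustrative only.

# Example generation (prompt and settings are illustrative)
messages = [{"role": "user", "content": "Explain LoRA in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

For inference-only deployments, model.merge_and_unload() folds the adapter weights into the base model so no PEFT wrapper is needed at serving time.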

Limitations and Biases

  • This adapter may inherit limitations and biases present in the base Llama-3.1-8B-Instruct model
  • The training run was short (~1,400 steps at batch size 2), which may limit the adapter's effectiveness