Model Specifications
- Max Sequence Length: 16384 tokens, extended via RoPE scaling
- Data Type: auto-detected, with options for float16 and bfloat16
- Quantization: 4-bit, to reduce memory usage (see the loading sketch below)
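These specifications map directly onto model-loading arguments. Below is a minimal sketch of loading the base checkpoint with 4-bit quantization and automatic dtype selection using transformers and bitsandbytes; the specific BitsAndBytesConfig values are illustrative assumptions, not the published training configuration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative 4-bit (NF4) quantization config; the actual training run
# started from the pre-quantized unsloth/gemma-2-27b-it-bnb-4bit checkpoint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=(
        torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
    ),  # auto-detect: prefer bfloat16 where supported, else float16
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-2-27b-it-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)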
Training Data
Fine-tuned on a private dataset containing hundreds of technical tutorials and their associated summaries; a hypothetical record layout is sketched below.
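Because the dataset is private, its exact schema is unpublished. Based on the inference prompt later in this card, each example plausibly pairs a raw transcript with a target summary; the field names transcript and summary below are assumptions for illustration only.

PROMPT_TEMPLATE = """{}
### Raw Transcript:
{}
### Summary:
{}"""

def format_example(record: dict) -> str:
    # record is a hypothetical {"transcript": ..., "summary": ...} pair;
    # the real (private) dataset schema is not published.
    return PROMPT_TEMPLATE.format(
        "Clarify and summarize this tutorial transcript",
        record["transcript"],
        record["summary"],
    )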
Implementation Highlights
- Efficiency: 4-bit quantization reduces memory usage and shrinks checkpoint download size.
- Adaptability: automatic data-type detection, plus support for advanced configuration options such as RoPE scaling, LoRA, and gradient checkpointing (a configuration sketch follows this list).
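Since the card credits Unsloth for the training speedup, here is a minimal sketch of how these options might be wired together with Unsloth's FastLanguageModel. The hyperparameter values (rank, alpha, target modules) are illustrative assumptions, not the recorded training recipe.

from unsloth import FastLanguageModel

# dtype=None enables automatic float16/bfloat16 detection;
# context beyond the base window is handled via RoPE scaling.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-27b-it-bnb-4bit",
    max_seq_length=16384,
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters (QLoRA); values here are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # reduces activation memory
)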
Uploaded Model
- Developed by: ndebuhr
- License: apache-2.0
- Finetuned from model: unsloth/gemma-2-27b-it-bnb-4bit
Configuration and Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

input_text = ""  # paste the raw tutorial transcript to summarize here

# Set device based on CUDA availability
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and tokenizer
model_name = "ndebuhr/Gemma-2-27B-Technical-Tutorial-Summarization-QLoRA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # pick float16/bfloat16 automatically
).to(device)
instruction = "Clarify and summarize this tutorial transcript"
prompt = """{}
### Raw Transcript:
{}
### Summary:
"""
# Tokenize the input text
inputs = tokenizer(
    prompt.format(instruction, input_text),
    return_tensors="pt",
    truncation=True,
    max_length=16384,
).to(device)
# Generate outputs
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,  # illustrative cap; max_length=16384 would also count the prompt tokens
    num_return_sequences=1,
    use_cache=True,
)
# Decode the generated text
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
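batch_decode returns the full sequences, prompt included, so as a usage note one might slice out just the summary by reusing the prompt's own delimiter:

# Keep only the text after the "### Summary:" marker in the first sequence
print(generated_text[0].split("### Summary:")[-1].strip())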
Compute Infrastructure
- Fine-tuning: 1x A100 (40 GB)
- Inference: 1x L4 (24 GB) recommended; at 4-bit precision the 27B weights occupy roughly 14 GB, leaving headroom for activations and the KV cache
This Gemma 2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.