Gemma2-27B-Swahili-IT

Gemma2-27B-Swahili-IT is a state-of-the-art open variant of Google's Gemma2-27B-IT model, fine-tuned for natural Swahili language understanding and generation. This model utilizes Quantized Low-Rank Adaptation (QLoRA) to achieve efficient fine-tuning while maintaining performance.

Model Details

  • Developer: Alfaxad Eyembe
  • Base Model: google/gemma-2-27b-it
  • Model Type: Decoder-only transformer
  • Language(s): Swahili
  • License: Apache 2.0
  • Fine-tuning Approach: QLoRA (4-bit quantization)

Training Data

The model was fine-tuned on a comprehensive dataset containing:

  • 67,017 instruction-response pairs
  • 16,273,709 total tokens
  • Average 242.83 tokens per example
  • High-quality, naturally-written Swahili content
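
The per-example average follows directly from the totals (16,273,709 / 67,017 ≈ 242.83). A minimal sketch for reproducing these statistics is shown below; the file path and field names are placeholders, since the underlying dataset is not linked in this card.

import json
from transformers import AutoTokenizer

# Placeholder path and field names - the actual training dataset is not linked in this card
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-27b-it")

token_counts = []
with open("swahili_instructions.jsonl") as f:
    for line in f:
        example = json.loads(line)
        text = example["instruction"] + "\n" + example["response"]
        token_counts.append(len(tokenizer(text)["input_ids"]))

print(f"{len(token_counts)} examples")
print(f"{sum(token_counts)} total tokens")
print(f"{sum(token_counts) / len(token_counts):.2f} tokens per example on average")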

Performance

Massive Multitask Language Understanding (MMLU) - Swahili

  • Base Model: 22.81% accuracy
  • Fine-tuned Model: 57.89% accuracy
  • Improvement: +35.08 percentage points

Swahili Sentiment Analysis

  • Base Model: 89.90% accuracy
  • Fine-tuned Model: 90.00% accuracy
  • Response Validity: 100%
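
The evaluation harness itself is not included in this card. As a rough illustration, a single multiple-choice MMLU-Swahili item could be scored by comparing the likelihood the model assigns to each candidate answer; the prompt wording and letter format below are assumptions, not necessarily the setup behind the reported numbers.

import torch

# Assumes `model` and `tokenizer` are loaded as shown in the Usage section below.
# Prompt wording and answer-letter format are illustrative assumptions.
def score_mmlu_item(question, choices):
    log_likelihoods = []
    for letter, choice in zip("ABCD", choices):
        text = f"Swali: {question}\nJibu: {letter}) {choice}"
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**inputs, labels=inputs["input_ids"])
        log_likelihoods.append(-out.loss.item())  # higher = more plausible to the model
    return "ABCD"[log_likelihoods.index(max(log_likelihoods))]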

Intended Use

This model is designed for:

  • Natural Swahili text generation
  • Question answering
  • Content analysis
  • Creative writing
  • General instruction following in Swahili

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True
)

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("alfaxadeyembe/gemma2-27b-swahili-it")
model = AutoModelForCausalLM.from_pretrained(
    "alfaxadeyembe/gemma2-27b-swahili-it",
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Always set to eval mode for inference
model.eval()

# Example usage
prompt = "Eleza dhana ya uchumi wa kidijitali na umuhimu wake katika ulimwengu wa leo."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.95
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
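
Gemma-2 instruction-tuned checkpoints are normally prompted through the tokenizer's chat template. Assuming this fine-tune keeps the base model's chat format (not stated explicitly above), the same example can be written as:

# Assumes the fine-tune keeps the base Gemma-2 chat template
messages = [
    {"role": "user", "content": "Eleza dhana ya uchumi wa kidijitali na umuhimu wake katika ulimwengu wa leo."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.95
    )

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))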

Training Details

  • Fine-tuning Method: QLoRA (4-bit quantization)
  • Training Steps: 150
  • Batch Size: 1
  • Gradient Accumulation Steps: 64
  • Learning Rate: 1.5e-4
  • Training Time: ~10 hours on an A100 GPU
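
The training script is not published with this card. A rough sketch of the QLoRA setup with peft is given below; the hyperparameters listed above are carried over, while the LoRA rank, alpha, dropout, and target modules are illustrative assumptions.

from transformers import TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Assumes `model` was loaded in 4-bit as shown in the Usage section
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                                     # assumption: rank not stated in this card
    lora_alpha=32,                                            # assumption
    lora_dropout=0.05,                                        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="gemma2-27b-swahili-it-qlora",
    max_steps=150,                   # from this section
    per_device_train_batch_size=1,   # from this section
    gradient_accumulation_steps=64,  # from this section
    learning_rate=1.5e-4,            # from this section
    bf16=True,
    logging_steps=10,
)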

Citation

@misc{gemma2-27b-swahili-it,
  author = {Alfaxad Eyembe},
  title = {Gemma2-27B-Swahili-IT: Swahili Variation of Gemma2-27b-it Model},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
}

Contact

For questions or feedback, please reach out through:
