Model Card for Katara Detector

This model classifies whether an image contains Katara from Avatar: The Last Airbender. It achieves 96.0% accuracy and a 96.1% F1 score on the validation set.

Model Details

Model Description

A binary image classifier that determines if Katara from the animated series "Avatar: The Last Airbender" is present in an image.

  • Developed by: Your Name/Organization
  • Model type: Image Classification
  • License: MIT
  • Finetuned from model: facebook/dinov2-small

Uses

Direct Use

This model can be used to:

  • Identify Katara in screenshots or fan art
  • Filter or categorize image collections related to Avatar: The Last Airbender (ATLA)
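  • Power fan applications that track character appearances

# Use a pipeline as a high-level helper
from PIL import Image
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
pipe = pipeline("image-classification", model="lumenggan/katara-detector")

# Replace with the path to your own image
image = Image.open("yourimage.png")

# The result is a list of dicts with "label" and "score" keys
print(pipe(image))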
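
The pipeline call above returns a list of dictionaries with "label" and "score" keys. A minimal sketch of turning that list into a yes/no decision is shown here. The label name "katara", the 0.5 threshold, and the helper name is_katara are assumptions for illustration only; check the model's id2label mapping for the real label names.

```python
def is_katara(predictions, positive_label="katara", threshold=0.5):
    """Return True if the positive label's score meets the threshold.

    `predictions` is a list of {"label": str, "score": float} dicts, the
    shape produced by an image-classification pipeline. The label name and
    threshold are hypothetical; adjust them to the model's id2label.
    """
    for pred in predictions:
        if pred["label"].lower() == positive_label.lower():
            return pred["score"] >= threshold
    # Positive label missing from the output (e.g. truncated by top_k)
    return False


# Hand-written example output, not real model predictions
sample = [
    {"label": "katara", "score": 0.93},
    {"label": "not_katara", "score": 0.07},
]
print(is_katara(sample))
```

Raising the threshold trades recall for precision, which may suit filtering large image collections where false positives are costly.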

Out-of-Scope Use

This model should not be used for:

  • Critical identification tasks
  • Monitoring or surveillance purposes
  • Making judgments about real people

Training Details

Training Data

The model was trained on a custom dataset of Katara images and non-Katara images from Avatar: The Last Airbender. The dataset was split 80/20 for training and validation.
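
The card does not say how the 80/20 split was drawn. One common approach is a per-class (stratified) random split, sketched below. The function name stratified_split, the seed, and the file names are hypothetical, and stratification itself is an assumption rather than the documented procedure.

```python
import random


def stratified_split(samples, train_fraction=0.8, seed=42):
    """Split (path, label) pairs into train/validation lists per class."""
    rng = random.Random(seed)

    # Group samples by label so each class is split separately
    by_label = {}
    for path, label in samples:
        by_label.setdefault(label, []).append((path, label))

    train, val = [], []
    for items in by_label.values():
        rng.shuffle(items)
        cut = int(len(items) * train_fraction)
        train.extend(items[:cut])
        val.extend(items[cut:])
    return train, val


# Hypothetical file names; 1 = Katara, 0 = not Katara
samples = [(f"katara_{i}.png", 1) for i in range(50)]
samples += [(f"other_{i}.png", 0) for i in range(50)]

train, val = stratified_split(samples)
print(len(train), len(val))
```

Splitting each class separately keeps the Katara/non-Katara ratio the same in both subsets, which makes validation metrics easier to compare with training behavior.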

Training Procedure

The model was fine-tuned from DINOv2-small using the following techniques:

  • Dropout regularization (rate=0.3)
  • Weight decay (0.01-0.05)
  • Cosine learning rate schedule with restarts
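
To illustrate the cosine schedule with restarts, the sketch below computes the learning rate at a given step. The base rate of 2e-5 comes from this card; the cycle length of 100 steps, the absence of warmup, and the function name are assumptions, since the card does not state them.

```python
import math


def cosine_with_restarts(step, steps_per_cycle, base_lr=2e-5):
    """Learning rate at `step` for cosine annealing with hard restarts.

    Within each cycle the rate decays from `base_lr` toward zero along a
    half cosine, then jumps back to `base_lr` when the next cycle starts.
    """
    position = step % steps_per_cycle  # step index inside the current cycle
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * position / steps_per_cycle))


# Sample the schedule at a few steps (hypothetical cycle length of 100)
for step in (0, 50, 99, 100):
    print(step, cosine_with_restarts(step, steps_per_cycle=100))
```

The restart at step 100 returns the rate to its starting value, which can help the optimizer leave a poor local minimum late in training.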

Training Hyperparameters

  • Learning rate: 2e-5
  • Weight decay: 0.01-0.05
  • Epochs: 5-15
  • Batch size: 16 (effective 32 with gradient accumulation)
  • Training regime: fp16 mixed precision
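
The effective batch size is the per-step batch size multiplied by the number of gradient accumulation steps (and by the number of devices). The sketch below recovers the accumulation steps implied by the figures above, assuming a single device; the function name is hypothetical.

```python
def accumulation_steps(per_step_batch, effective_batch, num_devices=1):
    """Number of micro-batches to accumulate before each optimizer update."""
    per_pass = per_step_batch * num_devices
    if effective_batch % per_pass != 0:
        raise ValueError("effective batch must be a multiple of the per-pass batch")
    return effective_batch // per_pass


# Batch size 16 with an effective batch of 32, single device assumed
steps = accumulation_steps(16, 32)
print(steps)
```

If training used the transformers Trainer (the card does not say), this would correspond to per_device_train_batch_size=16 and gradient_accumulation_steps=2 in TrainingArguments.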

Evaluation

Metrics

  • Accuracy: 96.0%
  • F1 Score: 96.1%
  • Precision: 96.8%
  • Recall: 95.5%
  • ROC AUC: 99.4%
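
As a consistency check, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two figures above. This is a worked check on the reported numbers, not a re-evaluation of the model.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in this card
f1 = f1_score(0.968, 0.955)
print(f"{f1:.3f}")  # prints 0.961
```

The result rounds to 96.1%, matching the reported F1 score.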

Technical Specifications

Model Architecture

  • Base model: facebook/dinov2-with-registers-small
  • Custom classification head with dropout
  • Input size: 224x224 RGB images
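
To make the input size concrete, the sketch below counts the tokens the vision transformer backbone processes for one 224x224 image. The patch size of 14 and the 4 register tokens are taken from the published DINOv2 small checkpoints and should be treated as assumptions here; verify them against the model's config.

```python
def vit_sequence_length(image_size=224, patch_size=14, num_registers=0):
    """Tokens per image: one per patch, plus the CLS token and any registers."""
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1 + num_registers


# Without register tokens: 16 x 16 patches plus CLS
print(vit_sequence_length())
# With 4 register tokens, as in the with-registers variant
print(vit_sequence_length(num_registers=4))
```

A classification head typically reads only the CLS token (sometimes combined with pooled patch tokens), so the head's input width depends on the backbone's hidden size, not on this sequence length.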

Compute Infrastructure

  • GPU: (e.g., NVIDIA T4, A100, etc.)
  • Training time: Approximately 1-2 hours

Model Card Contact

https://github.com/unLomTrois/
