---
library_name: torch
tags:
- image-classification
- resnet
- diagrams
- pytorch
- computer-vision
license: apache-2.0
metrics:
- accuracy
- f1
- recall
- precision
base_model:
- microsoft/resnet-18
pipeline_tag: image-classification
datasets:
- phiyodr/coco2017
- HuggingFaceM4/ChartQA
- JasmineQiuqiu/diagrams_with_captions_2
---
# Model Card for Diagram Classification Model
## Model Details
### Model Description
This is a fine-tuned ResNet-18 model trained for binary image classification, distinguishing between **diagrams** and **non-diagrams**. The model is designed for use in applications that need automatic filtering or processing of diagram-based content.
- **Developed by:** Aya Mohamed
- **Model type:** ResNet-18 (Fine-tuned for image classification)
- **Language(s) (NLP):** Not applicable (Computer Vision model)
- **License:** Apache 2.0
- **Finetuned from model:** `microsoft/resnet-18`
### Model Sources
- **Repository:** [Ayamohamed/diaclass-model](https://huggingface.co/Ayamohamed/diaclass-model)
## Uses
### Direct Use
This model is intended for classifying images as **diagrams** or **non-diagrams**. It can be used in:
- **Document processing** (extracting diagrams from PDFs or scanned documents)
- **Chart-based visual question generation (VQG)**
- **Content moderation** (filtering diagram images from general image datasets)
### Out-of-Scope Use
- Not suitable for **multi-class classification** beyond diagrams vs. non-diagrams.
- Not designed for **hand-drawn sketches** or **complex figures with mixed elements**.
## Bias, Risks, and Limitations
- The model's accuracy depends on the training dataset, which may not cover all possible diagram styles.
- May misclassify **charts, blueprints, or artistic drawings** if they resemble diagrams.
### Recommendations
Users should **evaluate the model** on their specific dataset before deployment to ensure it performs well in their context.
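One way to follow this recommendation is to run the model over a small labeled sample and compute the same metrics reported in this card. The sketch below is a minimal, pure-Python illustration. The function name `binary_metrics`, the sample labels, and the convention that `1` marks the positive (diagram) class are assumptions made for this example, not part of the released model.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical ground-truth labels and predictions (1 = diagram, 0 = non-diagram)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice, `y_pred` would come from running the model on your own images and mapping its output labels to `1` and `0`.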
## 🚀 How to Use
### **1️⃣ Load the Model from Hugging Face**
You can download the model and load it using `torch`.
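```python
import torch
from huggingface_hub import hf_hub_download

# Download the model file from the Hugging Face Hub
model_path = hf_hub_download(repo_id="Ayamohamed/DiaClassification", filename="model.pth")

# Load the model. The file holds a fully pickled model object (not a state_dict),
# so recent PyTorch versions need weights_only=False; only use it with files you trust.
model_hg = torch.load(model_path, map_location="cpu", weights_only=False)
model_hg.eval()  # Set to evaluation mode
```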
### **2️⃣ Preprocess and Classify an Image**
```python
from PIL import Image
from torchvision import transforms

# Define image transformations (ImageNet mean/std normalization)
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

def predict(image_path):
    # Load the image as RGB and add a batch dimension
    image = Image.open(image_path).convert("RGB")
    image = transform(image).unsqueeze(0)

    with torch.no_grad():
        output = model_hg(image)
        class_idx = torch.argmax(output, dim=1).item()

    # Class index 0 is "Diagram"; any other index is "Not Diagram"
    return "Diagram" if class_idx == 0 else "Not Diagram"

# Example usage
print(predict("my-diagram-classifier/31188_1536932698.jpg"))
```
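The `predict` function returns only a hard label. When a confidence score is also useful (for example, to send borderline images to manual review), the logits can be passed through a softmax. The sketch below assumes the model outputs raw logits of shape `(batch, 2)` with index 0 meaning "Diagram", matching the mapping used in `predict`. The helper name `label_with_confidence` and the demo logits are illustrative, not part of the released code.

```python
import torch

CLASS_NAMES = ["Diagram", "Not Diagram"]  # index 0 = Diagram, as in predict()

def label_with_confidence(logits):
    """Turn a (1, 2) logit tensor into a (label, probability) pair."""
    probs = torch.softmax(logits, dim=1)  # normalize logits to probabilities
    confidence, class_idx = torch.max(probs, dim=1)
    return CLASS_NAMES[class_idx.item()], confidence.item()

# Illustrative logits standing in for the output of model_hg(image)
demo_logits = torch.tensor([[2.0, 0.0]])
label, confidence = label_with_confidence(demo_logits)
print(label, round(confidence, 4))
```

A threshold on the returned probability (its value would need tuning on your own data) can then decide which predictions to accept automatically.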
## Training Details
### Training Data
The model was trained using:
- **ChartQA dataset** (for diagram samples)
- **JasmineQiuqiu/diagrams_with_captions_2** (for diagram samples)
- **COCO dataset (subset)** (for non-diagram samples)
### Training Procedure
- **Pretrained model:** `microsoft/resnet-18`
- **Optimization:** Adam optimizer
- **Loss function:** Cross-entropy loss
- **Training duration:** Approx. X hours on an NVIDIA GPU
## Evaluation
### Testing Data & Metrics
- **Dataset:** Held-out test set from ChartQA, AI2D-RST, and COCO
- **Metrics:**
  - **Test Loss:** 0.0371
  - **Test Accuracy:** 99.08%
  - **Precision:** 0.9995
  - **Recall:** 0.9820
  - **F1 Score:** 0.9907
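
As a consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 score can be recomputed from the precision and recall values listed above. The variable names in this short sketch are illustrative.

```python
precision = 0.9995
recall = 0.9820

# F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9907
```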
## Environmental Impact
- **Hardware Used:** NVIDIA A100 GPU
- **Compute Hours:** Approx. X hours
- **Estimated Carbon Emission:** Not reported; it can be estimated with the [MLCO2 Calculator](https://mlco2.github.io/impact#compute)
## Citation
If you use this model, please cite:
```bibtex
@misc{aya2025diaclass,
author = {Aya Mohamed},
title = {Diagram Classification Model},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/Ayamohamed/diaclass-model}
}
```