	---
	library_name: torch
	tags:
	- image-classification
	- resnet
	- diagrams
	- pytorch
	- computer-vision
	license: apache-2.0
	metrics:
	- accuracy
	- f1
	- recall
	- precision
	base_model:
	- microsoft/resnet-18
	pipeline_tag: image-classification
	datasets:
	- phiyodr/coco2017
	- HuggingFaceM4/ChartQA
	- JasmineQiuqiu/diagrams_with_captions_2
	---

	# Model Card for Diagram Classification Model

	## Model Details

	### Model Description

	This is a fine-tuned ResNet-18 model trained for binary image classification, distinguishing between diagrams and non-diagrams. The model is designed for use in applications that need automatic filtering or processing of diagram-based content.

	- Developed by: Aya Mohamed
	- Model type: ResNet-18 (Fine-tuned for image classification)
	- Language(s) (NLP): Not applicable (Computer Vision model)
	- License: Apache 2.0
	- Finetuned from model: `microsoft/resnet-18`

	### Model Sources

	- Repository: [Ayamohamed/diaclass-model](https://huggingface.co/Ayamohamed/diaclass-model)

	## Uses

	### Direct Use

	This model is intended for classifying images as diagrams or non-diagrams. It can be used in:
	- Document processing (extracting diagrams from PDFs or scanned documents)
	- Chart-based visual question generation (VQG)
	- Content moderation (filtering diagram images from general image datasets)

	### Out-of-Scope Use

	- Not suitable for multi-class classification beyond diagrams vs. non-diagrams.
	- Not designed for hand-drawn sketches or complex figures with mixed elements.

	## Bias, Risks, and Limitations

	- The model's accuracy depends on the training dataset, which may not cover all possible diagram styles.
	- The model may misclassify charts, blueprints, or artistic drawings that resemble diagrams.

	### Recommendations

	Users should evaluate the model on their specific dataset before deployment to ensure it performs well in their context.
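
One way to run such a check is to collect the model's predictions on a labeled sample of your own images and compute the same metrics this card reports. The sketch below is a minimal illustration, not the author's evaluation script: `binary_metrics` is a hypothetical helper, and the label strings are assumed to match those returned by the `predict` function in the How to Use section.

```python
def binary_metrics(y_true, y_pred, positive="Diagram"):
    """Compute accuracy, precision, recall, and F1 for a binary task.

    y_true and y_pred are equal-length sequences of label strings;
    `positive` names the class treated as the positive one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not real evaluation data)
y_true = ["Diagram", "Diagram", "Not Diagram", "Not Diagram", "Diagram"]
y_pred = ["Diagram", "Not Diagram", "Not Diagram", "Diagram", "Diagram"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice, `y_pred` would be built by calling `predict` on each image path in your sample, with `y_true` holding the labels you assigned by hand.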



	## 🚀 How to Use

	### 1️⃣ Load the Model from Hugging Face
	You can download the model and load it using `torch`.

	```python
	import torch
	from huggingface_hub import hf_hub_download

	# Download model from Hugging Face Hub
	model_path = hf_hub_download(repo_id="Ayamohamed/DiaClassification", filename="model.pth")

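	# Load the full pickled model object. PyTorch 2.6+ defaults to weights_only=True,
	# which rejects pickled modules, so weights_only=False is passed explicitly.
	# Only do this for checkpoints you trust.
	model_hg = torch.load(model_path, map_location="cpu", weights_only=False)
	model_hg.eval()  # Set to evaluation mode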

	```
	### 2️⃣ Preprocess and Classify an Image
	```python
	import torch
	from PIL import Image
	from torchvision import transforms

	# Resize to the 224x224 input size and normalize with ImageNet statistics
	transform = transforms.Compose([
	    transforms.Resize((224, 224)),
	    transforms.ToTensor(),
	    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
	])

	def predict(image_path):
	    """Classify one image file using `model_hg` loaded in step 1."""
	    image = Image.open(image_path).convert("RGB")
	    image = transform(image).unsqueeze(0)  # Add a batch dimension
	    with torch.no_grad():
	        output = model_hg(image)
	    class_idx = torch.argmax(output, dim=1).item()
	    return "Diagram" if class_idx == 0 else "Not Diagram"

	# Example usage
	print(predict("my-diagram-classifier/31188_1536932698.jpg"))
	```
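
If a confidence score is useful alongside the label, the raw outputs can be turned into probabilities with a softmax. The sketch below assumes the model returns logits of shape `(batch, 2)` with index 0 meaning "Diagram", as the `argmax` call above implies. `predict_with_confidence`, the `"Uncertain"` label, and the 0.9 threshold are illustrative choices, not part of the released model.

```python
import torch
import torch.nn.functional as F

# Index order assumed from the predict() function: 0 = Diagram
CLASS_NAMES = ["Diagram", "Not Diagram"]


def predict_with_confidence(logits, threshold=0.9):
    """Map a (batch, 2) tensor of logits to (label, confidence) pairs.

    Predictions whose top probability falls below `threshold`
    are reported as "Uncertain" instead of a class name.
    """
    probs = F.softmax(logits, dim=1)
    confidence, class_idx = probs.max(dim=1)
    results = []
    for conf, idx in zip(confidence.tolist(), class_idx.tolist()):
        label = CLASS_NAMES[idx] if conf >= threshold else "Uncertain"
        results.append((label, conf))
    return results


# Hand-written logits for illustration only (not real model outputs)
example_logits = torch.tensor([[2.0, -1.0], [0.1, 0.0]])
results = predict_with_confidence(example_logits)
print(results)
```

With a loaded model, the call would look like `predict_with_confidence(model_hg(image))` inside a `torch.no_grad()` block. The threshold is best tuned on your own validation images.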



	## Training Details

	### Training Data

	The model was trained using:
	- ChartQA dataset (for diagram samples)
	- JasmineQiuqiu/diagrams_with_captions_2 (for diagram samples)
	- COCO dataset (subset) (for non-diagram samples)

	### Training Procedure

	- Pretrained model: `microsoft/resnet-18`
	- Optimization: Adam optimizer
	- Loss function: Cross-entropy loss
	- Training duration: Approx. X hours on an NVIDIA GPU

	## Evaluation

	### Testing Data & Metrics

	- Dataset: Held-out test set from ChartQA, AI2D-RST, and COCO
	- Metrics:
	  - Test Loss: 0.0371
	  - Test Accuracy: 99.08%
	  - Precision: 0.9995
	  - Recall: 0.9820
	  - F1 Score: 0.9907
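
As a quick sanity check, the F1 score is the harmonic mean of precision and recall, so the reported values can be cross-checked against each other. The snippet below only redoes that arithmetic with the numbers listed above.

```python
# Values copied from the metrics listed in this card
precision = 0.9995
recall = 0.9820

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9907, matching the reported F1 score
```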

	## Environmental Impact

	- Hardware Used: NVIDIA A100 GPU
	- Compute Hours: Approx. X hours
	- Estimated Carbon Emission: [Use MLCO2 Calculator](https://mlco2.github.io/impact#compute)

	## Citation

	If you use this model, please cite:

	```bibtex
	@misc{aya2025diaclass,
	author = {Aya Mohamed},
	title = {Diagram Classification Model},
	year = {2025},
	publisher = {Hugging Face},
	url = {https://huggingface.co/Ayamohamed/diaclass-model}
	}
	```