---
language: en
datasets:
- abdulmananraja/real-life-violence-situations
tags:
- image-classification
- vision
- violence-detection
license: apache-2.0
---
# ViT Base Violence Detection

## Model Description

This is a Vision Transformer (ViT) model fine-tuned for violence detection. It is based on [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) and was fine-tuned on the [Real Life Violence Situations](https://www.kaggle.com/datasets/mohamedmustafa/real-life-violence-situations-dataset) dataset from Kaggle to classify images as violent or non-violent.
## Intended Use

The model is intended for applications that need to detect violent content in images, such as:

- Content moderation (see the sketch after this list)
- Surveillance
- Parental control software
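
As a rough illustration of the content-moderation use case, the sketch below wraps the model in a simple flagging helper. The `is_violent` function name, the probability threshold, and the assumption that the checkpoint exposes a class literally named `"violent"` in `model.config.label2id` are illustrative choices, not part of the released model; inspect `model.config.id2label` for the actual class names.

```python
import torch
from PIL import Image
from transformers import ViTFeatureExtractor, ViTForImageClassification

model = ViTForImageClassification.from_pretrained("jaranohaal/vit-base-violence-detection")
feature_extractor = ViTFeatureExtractor.from_pretrained("jaranohaal/vit-base-violence-detection")
model.eval()

def is_violent(image_path: str, threshold: float = 0.5) -> bool:
    """Flag an image when the predicted probability of the violent class exceeds the threshold."""
    image = Image.open(image_path).convert("RGB")
    inputs = feature_extractor(images=image, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(-1)[0]
    # Assumed label name; check model.config.id2label to confirm how the classes are named
    violent_idx = model.config.label2id.get("violent", 1)
    return probs[violent_idx].item() >= threshold

print(is_violent("image.jpg"))
```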
## Model Accuracy

- Test accuracy (ViT Base): 98.80%
- Loss: 0.2004
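
As a hedged sketch of how a comparable figure could be reproduced locally, the snippet below assumes a held-out folder laid out as `test/<class name>/<image>.jpg`, with folder names matching the entries in `model.config.label2id`; the path and layout are assumptions, not part of this repository.

```python
import torch
from pathlib import Path
from PIL import Image
from transformers import ViTFeatureExtractor, ViTForImageClassification

model = ViTForImageClassification.from_pretrained("jaranohaal/vit-base-violence-detection")
feature_extractor = ViTFeatureExtractor.from_pretrained("jaranohaal/vit-base-violence-detection")
model.eval()

correct = total = 0
for label_dir in Path("test").iterdir():  # assumed layout: test/<class name>/
    if not label_dir.is_dir():
        continue
    true_idx = model.config.label2id[label_dir.name]  # assumes folder names match the model's labels
    for path in label_dir.glob("*.jpg"):
        image = Image.open(path).convert("RGB")
        inputs = feature_extractor(images=image, return_tensors="pt")
        with torch.no_grad():
            pred_idx = model(**inputs).logits.argmax(-1).item()
        correct += int(pred_idx == true_idx)
        total += 1

print(f"Accuracy: {correct / total:.2%}")
```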
## How to Use

Here is an example of how to use this model for image classification:
```python
import torch
from transformers import ViTForImageClassification, ViTFeatureExtractor
from PIL import Image

# Load the model and feature extractor
model = ViTForImageClassification.from_pretrained('jaranohaal/vit-base-violence-detection')
feature_extractor = ViTFeatureExtractor.from_pretrained('jaranohaal/vit-base-violence-detection')

# Load an image
image = Image.open('image.jpg')

# Preprocess the image
inputs = feature_extractor(images=image, return_tensors="pt")

# Perform inference
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()

# Print the predicted class
print("Predicted class:", model.config.id2label[predicted_class_idx])
```
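
For quick experiments, the same checkpoint can also be run through the high-level `pipeline` API, which bundles preprocessing, inference, and label mapping in one call; this is a minimal sketch using the standard `image-classification` task, with `image.jpg` as a placeholder path.

```python
from transformers import pipeline

# The pipeline handles preprocessing, inference, and id-to-label mapping
classifier = pipeline("image-classification", model="jaranohaal/vit-base-violence-detection")

# Returns a list of {'label': ..., 'score': ...} dicts sorted by score
print(classifier("image.jpg"))
```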