metadata

library_name: transformers
license: apache-2.0
base_model: facebook/detr-resnet-50
tags:
  - generated_from_trainer
model-index:
  - name: detr_finetuned_cppe5
    results: []
datasets:
  - rishitdagli/cppe-5

Model Card for DETR Finetuned on CPPE-5

Model Overview

This model is a fine-tuned version of facebook/detr-resnet-50. The card's metadata lists rishitdagli/cppe-5 (CPPE-5, a medical personal protective equipment dataset) as the fine-tuning data. Fine-tuning adapts the model to recognize PPE items such as face shields, masks, gloves, and goggles.

The model is based on the DEtection TRansformer (DETR) architecture, with a ResNet-50 backbone for feature extraction. It keeps DETR's object detection pipeline unchanged, but its predictions are restricted to the PPE categories it was fine-tuned on, which makes it relevant to occupational safety use cases rather than general object detection.
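A minimal inference sketch using the standard transformers object detection classes follows. The checkpoint name "your-username/detr_finetuned_cppe5" is a placeholder, since this card does not state the repository owner, and the 0.5 score threshold is an arbitrary starting point rather than a recommended value.

```python
def format_detections(scores, labels, boxes, id2label):
    """Turn parallel score/label/box sequences into readable dicts."""
    detections = []
    for score, label, box in zip(scores, labels, boxes):
        detections.append({
            "label": id2label[int(label)],
            "score": round(float(score), 3),
            # Box corners in pixels: (xmin, ymin, xmax, ymax)
            "box": [round(float(v), 1) for v in box],
        })
    return detections


def detect_ppe(image_path,
               checkpoint="your-username/detr_finetuned_cppe5",  # placeholder repo id
               threshold=0.5):                                   # assumed score cutoff
    # Heavy imports stay local so format_detections works without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForObjectDetection

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = AutoModelForObjectDetection.from_pretrained(checkpoint)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]
    return format_detections(
        result["scores"].tolist(),
        result["labels"].tolist(),
        result["boxes"].tolist(),
        model.config.id2label,
    )
```

Calling detect_ppe("photo.jpg") would return a list of dicts with a label, a score, and a pixel-space box for each detection above the threshold.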

Model Performance

The model achieves the following metrics on its evaluation set:

Loss: 1.2294
mAP (mean Average Precision):
- Overall: 0.2366
- At IoU threshold 0.50: 0.4852
- At IoU threshold 0.75: 0.2032
- Small objects: 0.1082
- Medium objects: 0.2086
- Large objects: 0.3408
mAR (mean Average Recall):
- At 1 detection: 0.2819
- At 10 detections: 0.4463
- At 100 detections: 0.4665
- Small objects: 0.249
- Medium objects: 0.4004
- Large objects: 0.5893

Per-category precision and recall (face shields, gloves, goggles, masks) vary and are not broken out in this card. Small objects score lowest (mAP 0.1082 versus 0.3408 for large objects), so small items such as goggles likely have the most room for improvement.
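To make the IoU thresholds above concrete, here is a small sketch of how intersection over union decides whether a predicted box counts as a match. It assumes boxes in (xmin, ymin, xmax, ymax) format and a COCO-style rule where a prediction matches when its IoU with a ground-truth box reaches the threshold; the function names are illustrative only.

```python
def box_iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, iou_threshold):
    """True when the prediction overlaps the ground truth enough."""
    return box_iou(pred_box, gt_box) >= iou_threshold


gt = (0, 0, 100, 100)
pred = (20, 0, 120, 100)          # same size, shifted 20 px to the right
print(round(box_iou(pred, gt), 3))  # 0.667
print(is_match(pred, gt, 0.50))     # True
print(is_match(pred, gt, 0.75))     # False
```

A box that is only slightly misplaced can pass at 0.50 and fail at 0.75, which is one reason the stricter metric (0.2032) sits well below the looser one (0.4852).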

Intended Use and Limitations

Intended Use

Detecting personal protective equipment (PPE) in images or video streams.
Monitoring workplace safety by ensuring proper usage of PPE items such as masks, gloves, face shields, and goggles.
Suitable for industries like construction, healthcare, and manufacturing where PPE detection is critical for compliance and safety.
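As a rough illustration of the monitoring use case, the sketch below checks which required PPE items are absent from a list of detections. The label strings, the required set, and the score cutoff are all hypothetical; the real label names should be read from the model's id2label mapping. This is an image-level check only and does not associate items with individual people.

```python
REQUIRED_PPE = {"Mask", "Gloves"}  # hypothetical site policy, not from this card


def missing_ppe(detections, required=REQUIRED_PPE, min_score=0.5):
    """Return required labels with no detection at or above min_score."""
    found = {d["label"] for d in detections if d["score"] >= min_score}
    return sorted(set(required) - found)


detections = [
    {"label": "Mask", "score": 0.91},
    {"label": "Gloves", "score": 0.32},  # below the cutoff, so treated as absent
]
print(missing_ppe(detections))  # ['Gloves']
```

Given the reported accuracy, the output of a check like this is better treated as a prompt for human review than as a compliance decision.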

Limitations

The model may not generalize well to non-PPE items or general object detection tasks.
Performance on small or occluded objects can be limited, as indicated by lower mAP and mAR scores for small objects.
The model was trained on a dataset specific to PPE detection, so its performance on images outside of this domain might be inconsistent.
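The small, medium, and large buckets referenced above most likely follow the COCO convention, where size is defined by box area in pixels: below 32x32 is small, up to 96x96 is medium, and anything larger is large. That this evaluation used the COCO cutoffs is an assumption; the sketch below shows how a box would be bucketed under them.

```python
SMALL_MAX_AREA = 32 ** 2   # 1024 px^2, COCO cutoff between small and medium
MEDIUM_MAX_AREA = 96 ** 2  # 9216 px^2, COCO cutoff between medium and large


def size_bucket(box):
    """Classify an (xmin, ymin, xmax, ymax) box by pixel area."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


print(size_bucket((0, 0, 20, 20)))    # small
print(size_bucket((0, 0, 50, 50)))    # medium
print(size_bucket((0, 0, 200, 200)))  # large
```

Bucketing the objects in your own images this way gives a rough sense of which reported mAP figure (0.1082, 0.2086, or 0.3408) is the more relevant expectation.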

Training and Evaluation Data

The card's metadata lists rishitdagli/cppe-5 as the fine-tuning dataset. It focuses on personal protective equipment, such as face shields, masks, goggles, and gloves. Split sizes and preprocessing steps are not documented here.

Training Procedure

Hyperparameters:

Learning rate: 5e-05
Train batch size: 8
Eval batch size: 8
Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
Learning rate scheduler: Cosine decay
Number of epochs: 30
Seed: 42
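The hyperparameters above map onto transformers TrainingArguments roughly as follows. This is a reconstruction rather than the original training script: the output directory is a guess based on the model name, and treating the batch size of 8 as a per-device value assumes single-device training.

```python
# Keyword arguments mirroring the listed hyperparameters.
training_kwargs = dict(
    output_dir="detr_finetuned_cppe5",  # assumed, taken from the model name
    learning_rate=5e-5,
    per_device_train_batch_size=8,      # assumes a single device
    per_device_eval_batch_size=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=30,
    seed=42,
)


def build_training_arguments():
    # Imported lazily so the dict above can be inspected without transformers.
    from transformers import TrainingArguments
    return TrainingArguments(**training_kwargs)
```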

The model was trained for 30 epochs with the Adam optimizer, starting from a learning rate of 5e-05 that decays along a cosine schedule. The batch size was 8 for both training and evaluation.
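For intuition about the cosine schedule, the sketch below computes the learning rate at a given step using the usual half-cosine formula. It assumes no warmup steps, which this card does not specify, and the step counts in the example are arbitrary.

```python
import math


def cosine_lr(step, total_steps, base_lr=5e-5):
    """Half-cosine decay from base_lr to 0, assuming no warmup."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = 1000  # arbitrary number of optimizer steps for illustration
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, total):.2e}")
```

The rate starts at 5e-05, passes through half that value at the midpoint, and reaches zero at the final step.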

Evaluation Results

The following metrics were captured at selected epochs during training. Training ran for 30 epochs; the final evaluation metrics are listed under Model Performance.

Epoch	Validation Loss	mAP	mAP@0.50	mAP@0.75	mAR	Comments
1	2.1073	0.0518	0.1075	0.0423	0.2819	Initial training
5	1.6220	0.1223	0.2258	0.1115	0.4463	Significant improvement
10	1.5033	0.155	0.3265	0.1325	0.5032	Steady improvement
20	1.2649	0.2211	0.4427	0.1952	0.5867	Continued improvement
25	1.2347	0.2333	0.4831	0.1989	0.5966	Best of the listed epochs
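A quick way to read a log like the one above is to select the epoch with the highest mAP. The sketch below does this for the rows listed in the table; it covers only those five epochs, not the full 30-epoch run.

```python
# (epoch, validation_loss, mAP) copied from the table above
history = [
    (1, 2.1073, 0.0518),
    (5, 1.6220, 0.1223),
    (10, 1.5033, 0.155),
    (20, 1.2649, 0.2211),
    (25, 1.2347, 0.2333),
]


def best_epoch(rows):
    """Return the epoch number with the highest mAP."""
    return max(rows, key=lambda row: row[2])[0]


print(best_epoch(history))  # 25
```

Validation loss falls and mAP rises at every listed epoch, so among these rows the latest one is also the best.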

Limitations and Ethical Considerations

Limitations:

Domain-specific: The model targets PPE-related object detection and may not generalize to other tasks.
Bias: If the dataset is skewed or limited, certain PPE items may be under-represented, leading to poorer performance for some categories.
Real-time Applications: This card reports no latency figures, and the model might not meet the latency requirements for real-time detection in high-throughput environments.
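Whether the model is fast enough for a given deployment can be checked with a simple timing harness like the sketch below. It times any zero-argument callable, so the inference call has to be wrapped accordingly; the warmup and run counts are arbitrary defaults.

```python
import statistics
import time


def measure_latency(fn, n_warmup=3, n_runs=20):
    """Time a zero-argument callable and report mean latency and throughput."""
    for _ in range(n_warmup):
        fn()  # warmup runs are not timed
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    mean_s = statistics.mean(timings)
    return {
        "mean_ms": mean_s * 1000.0,
        "fps": 1.0 / mean_s if mean_s > 0 else float("inf"),
    }


# Stand-in workload; replace with a lambda that runs one forward pass.
stats = measure_latency(lambda: sum(range(100_000)))
print(f"{stats['mean_ms']:.2f} ms per call, {stats['fps']:.1f} calls/s")
```

When timing on a GPU, the wrapped callable should synchronize the device (for example with torch.cuda.synchronize()) before returning, because CUDA kernels run asynchronously and the measured time would otherwise be too low.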

Ethical Considerations:

Privacy: Using this model in surveillance scenarios (e.g., workplaces) may raise concerns about employee privacy, especially if applied without clear consent.
Misuse: With an overall mAP of 0.2366, false positives and missed detections are likely. Acting on predictions without human review could lead to incorrect enforcement of safety regulations.

Future Work

Dataset Improvements: Expanding the dataset to include more diverse PPE items, environments, and object scales could improve model performance, especially for smaller objects.
Model Efficiency: Further fine-tuning or model distillation may help make the model more suitable for real-time applications.