---
license: eupl-1.2
datasets:
  - NetherlandsForensicInstitute/vuurwerkverkenner-development-data
  - NetherlandsForensicInstitute/vuurwerkverkenner-application-data
language:
  - en
  - nl
metrics:
  - accuracy
---

# Model Card Vuurwerkverkenner

This model, developed by the Netherlands Forensic Institute, is designed to link fragments from exploded fireworks to
their corresponding firework types. An application utilizing this model is available at www.vuurwerkverkenner.nl.

## Architecture

The classification process involves two components: an embedding model that generates embeddings, and a classification
model that determines classifications based on the distances between these embeddings. The classification component is
mainly used for model evaluation. In practice, the application compares the embedding of the provided snippet image to
the stored embeddings of the wrappers in the database. This setup allows new wrappers to be added without retraining
the model.

### _Embedding model_

First, we train an embedding model that produces similar embeddings for snippets from the same source and dissimilar
embeddings for snippets from different sources. This model is based on the Vision Transformer
architecture ([arXiv](https://arxiv.org/abs/2010.11929)) and is fine-tuned with the following specifications:

* Model: ViT-B/32, with an L2-normalized linear layer as embedding head
* Input: RGB image of 448x448 pixels
* Output/embedding layer size: 128
* Training loss: ProxyAnchorLoss
  (see [here](https://kevinmusgrave.github.io/pytorch-metric-learning/losses/#proxyanchorloss)) with margin = 0.5 and
  alpha = 64
* Optimizer: AdamW with a fixed learning rate of 1e-4 for the model weights and 1e-2 for the proxy vectors
* Batch size: 150
* Epochs: 20
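
As an illustration of the embedding head listed above, the following is a minimal pure-Python sketch of a linear layer followed by L2 normalization. It is not the application's code: the function names, the toy weights, and the toy dimensions (4 input features, 2 outputs) are assumptions made for readability, whereas the actual head outputs 128 values.

```python
import math


def linear(features, weights, bias):
    """Apply a linear layer: one output value per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / max(norm, eps) for c in vector]


def embedding_head(features, weights, bias):
    """Hypothetical embedding head: linear projection, then L2 normalization."""
    return l2_normalize(linear(features, weights, bias))


# Toy example (assumed values): 4 backbone features projected to a 2-dimensional embedding.
toy_weights = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
toy_bias = [0.0, 0.0]
embedding = embedding_head([1.0, 2.0, 3.0, 4.0], toy_weights, toy_bias)
```

Because every embedding has unit length, the cosine similarity between two embeddings reduces to their dot product.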

### _Classification_

To connect a snippet photo to a firework wrapper, the trained embedding model generates reference embeddings from a
background dataset, as well as an embedding for the snippet photo. Classification is achieved by calculating the cosine
distance between the snippet photo embedding and the reference embeddings for each firework wrapper. The minimum
distance among a wrapper's reference embeddings is used as the representative score for that category.
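
The scoring described above can be sketched as follows. This is a minimal illustration under assumptions, not the application's implementation: the function names, the dictionary layout of the reference set, and the 2-dimensional toy embeddings are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_wrappers(query_embedding, reference_embeddings):
    """Score each wrapper by its closest reference embedding and sort ascending.

    `reference_embeddings` maps a wrapper id to a list of embeddings
    (an assumed layout, chosen for this sketch).
    """
    scores = {
        wrapper_id: min(cosine_distance(query_embedding, ref) for ref in refs)
        for wrapper_id, refs in reference_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy reference set with hypothetical wrapper ids and 2-dimensional embeddings.
references = {
    "wrapper_a": [[1.0, 0.0], [0.8, 0.6]],
    "wrapper_b": [[0.0, 1.0]],
}
ranking = rank_wrappers([0.6, 0.8], references)
```

Adding a new wrapper in this sketch only requires adding an entry to `references`, which mirrors why the model does not need retraining when wrappers are added.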

#### _Text filter_

A text filter can optionally be applied after classification. It matches firework labels based on text found on the
snippet. The snippet text must be entered manually, and all text fragments must be present on the label to get a
match.
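
A minimal sketch of such a filter is shown below. The function names, the label texts, and the case-insensitive substring matching are assumptions made for illustration. The card only states that all entered fragments must be present on the label.

```python
def matches_label(label_text, fragments):
    """Return True if every non-empty fragment occurs in the label text.

    Matching is case-insensitive substring matching (an assumption of this sketch).
    """
    normalized = label_text.casefold()
    cleaned = [f.strip().casefold() for f in fragments if f.strip()]
    return all(fragment in normalized for fragment in cleaned)


def filter_by_text(ranking, label_texts, fragments):
    """Keep only ranked wrappers whose label text contains all fragments."""
    return [
        (wrapper_id, score)
        for wrapper_id, score in ranking
        if matches_label(label_texts.get(wrapper_id, ""), fragments)
    ]


# Hypothetical label texts and ranking, for illustration only.
label_texts = {
    "wrapper_a": "Super Thunder King 100 shots",
    "wrapper_b": "Thunder Cracker",
}
ranking = [("wrapper_b", 0.04), ("wrapper_a", 0.2)]
filtered = filter_by_text(ranking, label_texts, ["thunder", "KING"])
```

In this example only `wrapper_a` remains, because `wrapper_b` lacks the fragment "KING" even though it ranked first on embedding distance.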

## Data

The model is trained and evaluated using data from fireworks involved in cases at the Netherlands Forensic Institute
since 2010. The dataset is divided into three parts: train, validation, and holdout. The train and validation sets are
used for training and model selection. The final model in the application is trained on all data except the holdout
set. Further information on the development and application data can be
found [here](https://huggingface.co/datasets/NetherlandsForensicInstitute/vuurwerkverkenner-development-data)
and [here](https://huggingface.co/datasets/NetherlandsForensicInstitute/vuurwerkverkenner-application-data).

### _Real snippets_

We have generated snippets for the available firework categories by detonating the fireworks. These real snippets (also
called 'lab snippets') are photographed with a high-quality DSLR camera against a white background under optimal
lighting conditions. The snippets are segmented, distributed across the train, validation, and holdout sets,
and grouped into images containing 1 to 10 snippets.

### _Mock-crime scene snippets_

For certain categories, we have created photos that imitate crime scene conditions, e.g. by using suboptimal lighting
and/or a phone camera. Because the model performs better with less background noise, the snippets in these photos are
placed on 'DNA blankets', which provide a fairly uniform background.

### _Artificial snippets_

To ensure the embedding model outputs embeddings for all firework wrappers, including those without real snippets, we
create 'artificial snippets' by randomly cropping wrapper images. Each artificial snippet image comprises 1 to 10
snippet pieces, and several such images are created per wrapper. An additional set is generated for each wrapper to
serve as the reference dataset. Its embeddings are stored and compared against the image provided in the application.
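
The random cropping step might look like the sketch below, which only generates crop boxes and leaves the image handling out. The rectangular crop shape, the size fractions, and the function name are assumptions. The card does not describe how crops are shaped or sized.

```python
import random


def random_crop_boxes(width, height, rng, min_pieces=1, max_pieces=10,
                      min_frac=0.1, max_frac=0.4):
    """Draw 1 to 10 random rectangular crop boxes inside a wrapper image.

    Boxes are (left, top, right, bottom) in pixels. The size fractions are
    assumed values for this sketch, not taken from the model card.
    """
    n_pieces = rng.randint(min_pieces, max_pieces)
    boxes = []
    for _ in range(n_pieces):
        # Pick a crop size as a fraction of the wrapper dimensions.
        crop_w = max(1, int(width * rng.uniform(min_frac, max_frac)))
        crop_h = max(1, int(height * rng.uniform(min_frac, max_frac)))
        # Pick a top-left corner such that the crop stays inside the image.
        left = rng.randint(0, width - crop_w)
        top = rng.randint(0, height - crop_h)
        boxes.append((left, top, left + crop_w, top + crop_h))
    return boxes


# A seeded generator keeps the sketch reproducible; 800x600 is a hypothetical wrapper size.
rng = random.Random(0)
boxes = random_crop_boxes(800, 600, rng)
```

Each returned box could then be passed to an image library's crop function to cut one artificial snippet piece out of the wrapper image.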

## Evaluation

To assess differences in performance across conditions, we compile a test set featuring artificial, real, and
mock-crime scene images. The evaluation covers the entire set and also examines performance per snippet type and for
categories with many similar wrappers.

### _Metrics_

| Metric                             | Value              |
| ---------------------------------- | ------------------ |
| RecallAtKValidator(k=1)            | 0.9475017269168777 |
| RecallAtKValidator(k=3)            | 0.9715634354133088 |
| RecallAtKValidator(k=5)            | 0.9757080359198711 |
| CategoricalRecallAtKValidator(k=1) | 0.985493898227032  |
| CategoricalRecallAtKValidator(k=5) | 0.9945889937830993 |
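
The exact definitions of `RecallAtKValidator` and `CategoricalRecallAtKValidator` are not given in this card. As a rough guide to reading the table, the sketch below assumes the conventional recall@k: the fraction of test images whose true wrapper appears among the k closest wrappers. The function name and toy data are hypothetical, and the categorical variant is not sketched.

```python
def recall_at_k(rankings, true_labels, k):
    """Fraction of queries whose true label is among the top-k ranked labels.

    `rankings` holds, per query, a list of labels sorted from closest to
    farthest. This is the conventional definition and an assumption here.
    """
    hits = sum(
        1 for ranking, truth in zip(rankings, true_labels)
        if truth in ranking[:k]
    )
    return hits / len(true_labels)


# Toy example: three queries that all belong to hypothetical wrapper "a".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
true_labels = ["a", "a", "a"]
recall_1 = recall_at_k(rankings, true_labels, k=1)
recall_3 = recall_at_k(rankings, true_labels, k=3)
```

By construction, recall@k can only stay equal or increase as k grows, which is consistent with the values in the table.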


### _Limitations_

The evaluation results may not reflect the model's real-world performance for several reasons. Training and testing
have occurred exclusively with snippets featuring plain backgrounds and optimal lighting, conditions that are not
always achievable in practice. Model performance is likely higher with better-quality photos, a sufficient number of
distinctive snippets, and correctly entered text, and may drop when these conditions are not met. Additionally,
if the firework type under investigation is new or rare, it may be absent from the reference database, in which case
the model cannot return it.

## Using the model

This model is intended for use with the Vuurwerkverkenner application, which includes the necessary code for operation.
The application's source code can be accessed
on [GitHub](https://github.com/NetherlandsForensicInstitute/vuurwerkverkenner).