NetherlandsForensicInstitute/vuurwerkverkenner

tdirkse-nfi committed on Apr 25, 2024

Commit c3eec12 (verified) · 1 Parent(s): f0a644b

Update README.md

Files changed (1)

README.md +1 -1

@@ -84,7 +84,7 @@ In evaluating the performance of the model, we consider two factors as important
 * The model may encounter firework categories for which it has seen (real) snippets during training (‘best case’), or snippets for categories that are not present in the train set (‘worst case').
 To capture the difference in performance across conditions, we construct a separate test set for the lab snippets and the mock-crime scene snippets.
-For the lab snippets, we split the test set into two parts: one for which snippets are present in the test set (best-case) and a second part for which they are not (worst-case).
+For the lab snippets, we split the test set into two parts: one for which categories are present in the train set (best-case) and a second part for which they are not (worst-case).
 As the mock-crime scene dataset only consists of 7 classes, we are unable to construct a worst-case test set – so we only report best-case performance for this dataset.
 In practice, a drop in performance may of course be expected in the worst-case scenario for (mock-)crime scene snippets.
 Overall, we find that the model performs very well for classes that are present in the train set, and that the text filter gives a significant boost if this is not the case.
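
The corrected sentence describes splitting the lab test set by whether a snippet's category also occurs in the train set. A minimal sketch of that split is given here for illustration only; the function name `split_best_worst`, the `(snippet_id, category)` tuple layout, and the example category labels are all assumptions, not taken from the repository.

```python
from typing import Iterable, List, Tuple

Snippet = Tuple[str, str]  # (snippet_id, category); hypothetical layout


def split_best_worst(
    test_snippets: Iterable[Snippet],
    train_categories: Iterable[str],
) -> Tuple[List[Snippet], List[Snippet]]:
    """Split test snippets into best-case and worst-case parts.

    best-case: the snippet's category is present in the train set.
    worst-case: the snippet's category is absent from the train set.
    """
    seen = set(train_categories)
    best_case: List[Snippet] = []
    worst_case: List[Snippet] = []
    for snippet in test_snippets:
        _, category = snippet
        if category in seen:
            best_case.append(snippet)
        else:
            worst_case.append(snippet)
    return best_case, worst_case


# Hypothetical example data.
train_cats = ["cat_a", "cat_b"]
test_set = [("s1", "cat_a"), ("s2", "cat_c"), ("s3", "cat_b")]
best, worst = split_best_worst(test_set, train_cats)
```

With a dataset that has only a few classes, as noted for the mock-crime scene snippets, the worst-case part of such a split may be empty or too small to report on.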