---
title: Jaccard Similarity
emoji: 🤗 
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  Jaccard similarity coefficient score is defined as the size of the intersection divided by the size of the union of two label sets. It is used to compare the set of predicted labels for a sample to the corresponding set of true labels.
---

# Metric Card for Jaccard Similarity

## Metric Description

The Jaccard similarity coefficient score, also known as the Jaccard index, is defined as the size of the intersection divided by the size of the union of two label sets. It is used to compare the set of predicted labels for a sample to the corresponding set of true labels.

For binary classification, it can be computed as:

    Jaccard = TP / (TP + FP + FN)

where TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.
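
The formula is easy to verify by hand. A minimal sketch in plain Python with made-up labels (this is just the arithmetic, not this metric's implementation):

```python
# Made-up binary labels, for illustration only.
references = [0, 1, 1, 1]
predictions = [1, 1, 0, 1]

# Count true positives, false positives, and false negatives.
tp = sum(r == 1 and p == 1 for r, p in zip(references, predictions))  # 2
fp = sum(r == 0 and p == 1 for r, p in zip(references, predictions))  # 1
fn = sum(r == 1 and p == 0 for r, p in zip(references, predictions))  # 1

print(tp / (tp + fp + fn))  # 0.5
```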

The metric also supports multiclass and multilabel classification by treating the problem as a collection of binary problems, one per label, and combining the per-label scores according to the `average` parameter.
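
To make the decomposition concrete, here is a sketch using `sklearn.metrics.jaccard_score` (the scikit-learn function whose parameters this metric mirrors); `average=None` returns one binary score per class, and `'macro'` is simply their unweighted mean:

```python
from sklearn.metrics import jaccard_score

references = [0, 1, 2, 3]
predictions = [0, 2, 1, 3]

# One binary Jaccard score per class (classes 0 and 3 match, 1 and 2 do not).
per_class = jaccard_score(references, predictions, average=None)
print(per_class)  # [1. 0. 0. 1.]

# 'macro' is the unweighted mean of the per-class scores.
print(jaccard_score(references, predictions, average="macro"))  # 0.5
```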

## How to Use

At minimum, this metric requires predictions and references as inputs.

```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> results = jaccard_metric.compute(references=[1, 0, 1, 1, 0, 1], predictions=[1, 1, 1, 0, 0, 1])
>>> print(results)
{'jaccard_similarity': 0.6}
```

### Inputs
- **predictions** (`list` of `int` or `array-like` of shape (n_samples,) or (n_samples, n_classes)): Predicted labels or label indicators.
- **references** (`list` of `int` or `array-like` of shape (n_samples,) or (n_samples, n_classes)): Ground truth labels or label indicators.
- **average** (`string`, default='binary'): Determines the type of averaging performed on the data; required for multiclass/multilabel targets. Options are ['binary', 'micro', 'macro', 'samples', 'weighted', None]; if None, the score for each class is returned.
- **labels** (`list` of `int`, default=None): The set of labels to include when `average != 'binary'`.
- **pos_label** (`int`, `float`, `bool` or `str`, default=1): The class to report if `average='binary'` and the data is binary.
- **sample_weight** (`list` of `float`, default=None): Sample weights.
- **zero_division** (`"warn"`, `0` or `1`, default="warn"): Sets the value to return when there is a zero division, i.e. when all predictions and references are negative. If `"warn"`, this acts like `0`, but a warning is also raised.
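
A sketch showing several of these parameters together, assuming they are forwarded to scikit-learn unchanged (the label values are illustrative):

```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> # Restrict the macro average to classes 0 and 2 via `labels`.
>>> results = jaccard_metric.compute(
...     references=[0, 1, 2, 0],
...     predictions=[0, 2, 2, 0],
...     average='macro',
...     labels=[0, 2],
... )
>>> print(results)
{'jaccard_similarity': 0.75}
```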

### Output Values
- **jaccard_similarity** (`float` or `ndarray` of `float64`): Jaccard similarity score, ranging from 0 to 1.0. A higher score means higher similarity.

This metric outputs a dictionary containing the Jaccard similarity score. Output example:

```python
{'jaccard_similarity': 0.6}
```

#### Values from Popular Papers

Jaccard similarity is often used in information retrieval and text-similarity tasks, for example to evaluate named entity recognition systems or to detect plagiarism. Reported values vary widely with the task and dataset, so scores are best compared within a single benchmark.

### Examples

Example 1 - Binary classification:
```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> results = jaccard_metric.compute(references=[0, 1, 1, 1], predictions=[1, 1, 0, 1])
>>> print(results)
{'jaccard_similarity': 0.5}
```

Example 2 - Multiclass classification:
```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> results = jaccard_metric.compute(references=[0, 1, 2, 3], predictions=[0, 2, 1, 3], average='macro')
>>> print(results)
{'jaccard_similarity': 0.5}
```

Example 3 - Multilabel classification:
```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> results = jaccard_metric.compute(
...     references=[[0, 1, 1], [0, 1, 1]],
...     predictions=[[1, 1, 0], [0, 1, 0]],
...     average='samples'
... )
>>> print(results)
{'jaccard_similarity': 0.41666666666666663}
```

## Limitations and Bias
Jaccard similarity may be a poor metric if there are no positives for some samples or classes. It is undefined when there are no true or predicted positive labels; with the default `zero_division="warn"`, this implementation returns a score of 0 and raises a warning in such cases.
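
For instance, when neither the references nor the predictions contain the positive class, TP + FP + FN is zero (illustrative values; the exact warning text depends on the underlying implementation):

```python
>>> jaccard_metric = evaluate.load("jaccard_similarity")
>>> # No positive labels anywhere: TP = FP = FN = 0, so the score is undefined.
>>> results = jaccard_metric.compute(references=[0, 0, 0], predictions=[0, 0, 0])
>>> print(results)  # a warning about the undefined score is also emitted
{'jaccard_similarity': 0.0}
```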

For imbalanced datasets, Jaccard similarity might not provide a complete picture of the model's performance. In such cases, it's often beneficial to use it in conjunction with other metrics like precision, recall, and F1-score.
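
One convenient way to do that with the `evaluate` library is `evaluate.combine`. A minimal sketch, assuming the standard `precision`, `recall`, and `f1` metrics are available:

```python
>>> import evaluate
>>> clf_metrics = evaluate.combine(["precision", "recall", "f1"])
>>> results = clf_metrics.compute(references=[0, 1, 1, 1], predictions=[1, 1, 0, 1])
>>> print(results)
{'precision': 0.6666666666666666, 'recall': 0.6666666666666666, 'f1': 0.6666666666666666}
```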

## Citation
```bibtex
@article{scikit-learn,
  title={Scikit-learn: Machine Learning in {P}ython},
  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
         and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
         and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
         Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
  journal={Journal of Machine Learning Research},
  volume={12},
  pages={2825--2830},
  year={2011}
}
```

## Further References
- [Scikit-learn documentation on Jaccard similarity score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html)
- [Wikipedia entry for the Jaccard index](https://en.wikipedia.org/wiki/Jaccard_index)