Ruchin committed on
Commit 246690c
1 Parent(s): 353d486

Adding jaccard similarity

Files changed (5):
  1. README.md +110 -4
  2. app.py +6 -0
  3. jaccard_similarity.py +102 -0
  4. requirements.txt +1 -0
  5. tests.py +73 -0
README.md CHANGED
@@ -1,12 +1,118 @@
  ---
  title: Jaccard Similarity
- emoji: 📈
- colorFrom: red
  colorTo: red
  sdk: gradio
- sdk_version: 4.44.0
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
  title: Jaccard Similarity
+ emoji: 🤗
+ colorFrom: blue
  colorTo: red
  sdk: gradio
+ sdk_version: 3.19.1
  app_file: app.py
  pinned: false
+ tags:
+ - evaluate
+ - metric
+ description: >-
+   Jaccard similarity coefficient score is defined as the size of the intersection divided by the size of the union of two label sets. It is used to compare the set of predicted labels for a sample to the corresponding set of true labels.
  ---

+ # Metric Card for Jaccard Similarity
+
+ ## Metric Description
+
+ The Jaccard similarity coefficient score, also known as the Jaccard index, is defined as the size of the intersection divided by the size of the union of two label sets. It is used to compare the set of predicted labels for a sample to the corresponding set of true labels.
+
+ For binary classification, it can be computed as:
+
+     Jaccard = TP / (TP + FP + FN)
+
+ where TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.
+
+ The metric supports multiclass and multilabel classification by treating each label as a separate binary problem.
+
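As a quick sanity check, the binary formula can be reproduced by counting TP, FP, and FN by hand and comparing against scikit-learn's `jaccard_score`, which this metric wraps (a minimal sketch with illustrative data):

```python
from sklearn.metrics import jaccard_score

refs  = [0, 1, 1, 1]
preds = [1, 1, 0, 1]

# Count true positives, false positives, and false negatives for label 1
tp = sum(p == 1 and r == 1 for p, r in zip(preds, refs))  # 2
fp = sum(p == 1 and r == 0 for p, r in zip(preds, refs))  # 1
fn = sum(p == 0 and r == 1 for p, r in zip(preds, refs))  # 1

manual = tp / (tp + fp + fn)  # 2 / 4 = 0.5
assert manual == jaccard_score(refs, preds)
```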
+ ## How to Use
+
+ At minimum, this metric requires predictions and references as inputs.
+
+ ```python
+ >>> jaccard_metric = evaluate.load("jaccard_similarity")
+ >>> results = jaccard_metric.compute(references=[0, 1, 1, 0], predictions=[0, 1, 0, 1])
+ >>> print(results)
+ {'jaccard_similarity': 0.3333333333333333}
+ ```
+
+ ### Inputs
+ - **predictions** (`list` of `int` or array-like of shape (n_samples,) or (n_samples, n_classes)): Predicted labels or label indicators.
+ - **references** (`list` of `int` or array-like of shape (n_samples,) or (n_samples, n_classes)): Ground truth labels or label indicators.
+ - **average** (`str`, default=`'binary'`): Required for multiclass/multilabel targets. One of `'binary'`, `'micro'`, `'macro'`, `'samples'`, `'weighted'`, or `None`.
+ - **labels** (`list` of `int`, default=`None`): The set of labels to include when `average != 'binary'`.
+ - **pos_label** (`int`, `float`, `bool` or `str`, default=`1`): The class to report if `average='binary'` and the data is binary.
+ - **sample_weight** (`list` of `float`, default=`None`): Sample weights.
+ - **zero_division** (`"warn"`, `0` or `1`, default=`"warn"`): The value to return when there is a zero division.
+
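To illustrate how the `average` options differ, here is a small multiclass sketch that calls scikit-learn's `jaccard_score` directly (the data is illustrative, not from the card):

```python
from sklearn.metrics import jaccard_score

refs  = [0, 1, 2, 0]
preds = [0, 2, 2, 0]

# Per-class scores: class 0 -> 1.0, class 1 -> 0.0, class 2 -> 0.5
per_class = jaccard_score(refs, preds, average=None)

# macro: unweighted mean of the per-class scores
macro = jaccard_score(refs, preds, average="macro")  # 0.5

# micro: pooled TP / (TP + FP + FN) across all classes
micro = jaccard_score(refs, preds, average="micro")  # 3 / 5 = 0.6
```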
+ ### Output Values
+ - **jaccard_similarity** (`float`, or `ndarray` of `float64` when `average=None`): Jaccard similarity score. The minimum possible value is 0.0 and the maximum is 1.0; a higher score means higher similarity.
+
+ Output Example:
+ ```python
+ {'jaccard_similarity': 0.3333333333333333}
+ ```
+
+ This metric outputs a dictionary containing the Jaccard similarity score.
+
+ #### Values from Popular Papers
+
+ Jaccard similarity is often used in information retrieval and text similarity tasks, for example to evaluate named entity recognition systems or in plagiarism detection.
+
+ ### Examples
+
+ Example 1 - Binary classification:
+ ```python
+ >>> jaccard_metric = evaluate.load("jaccard_similarity")
+ >>> results = jaccard_metric.compute(references=[0, 1, 1, 1], predictions=[1, 1, 0, 1])
+ >>> print(results)
+ {'jaccard_similarity': 0.5}
+ ```
+
+ Example 2 - Multiclass classification:
+ ```python
+ >>> jaccard_metric = evaluate.load("jaccard_similarity")
+ >>> results = jaccard_metric.compute(references=[0, 1, 2, 3], predictions=[0, 2, 1, 3], average='macro')
+ >>> print(results)
+ {'jaccard_similarity': 0.5}
+ ```
+
+ Example 3 - Multilabel classification:
+ ```python
+ >>> jaccard_metric = evaluate.load("jaccard_similarity")
+ >>> results = jaccard_metric.compute(
+ ...     references=[[0, 1, 1], [0, 1, 1]],
+ ...     predictions=[[1, 1, 0], [0, 1, 0]],
+ ...     average='samples'
+ ... )
+ >>> print(results)
+ {'jaccard_similarity': 0.41666666666666663}
+ ```
+
+ ## Limitations and Bias
+ Jaccard similarity may be a poor metric if there are no positives for some samples or classes. It is undefined when there are no true or predicted labels; by default this implementation returns a score of 0 and raises a warning in such cases (configurable via `zero_division`).
+
+ For imbalanced datasets, Jaccard similarity might not provide a complete picture of the model's performance. In such cases, it is often useful alongside other metrics such as precision, recall, and F1 score.
+
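The undefined case can be seen directly with scikit-learn's `jaccard_score`, which this metric wraps (a minimal sketch):

```python
from sklearn.metrics import jaccard_score

# No true or predicted positives: the union is empty, so the score is undefined.
refs  = [0, 0, 0]
preds = [0, 0, 0]

# zero_division="warn" (the default) returns 0.0 and emits a warning;
# an explicit zero_division value is returned silently instead.
score0 = jaccard_score(refs, preds, zero_division=0)  # 0.0
score1 = jaccard_score(refs, preds, zero_division=1)  # 1.0
```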
+ ## Citation
+ ```bibtex
+ @article{scikit-learn,
+   title={Scikit-learn: Machine Learning in {P}ython},
+   author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
+           and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
+           and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
+           Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
+   journal={Journal of Machine Learning Research},
+   volume={12},
+   pages={2825--2830},
+   year={2011}
+ }
+ ```
+
+ ## Further References
+ - [Scikit-learn documentation on Jaccard similarity score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html)
+ - [Wikipedia entry for the Jaccard index](https://en.wikipedia.org/wiki/Jaccard_index)
app.py ADDED
@@ -0,0 +1,6 @@
+ import evaluate
+ from evaluate.utils import launch_gradio_widget
+
+
+ module = evaluate.load("Ruchin/jaccard_similarity")
+ launch_gradio_widget(module)
jaccard_similarity.py ADDED
@@ -0,0 +1,102 @@
+ # Copyright 2023 The HuggingFace Evaluate Authors.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+ """Jaccard similarity metric."""
+
+ import datasets
+ import evaluate
+ from sklearn.metrics import jaccard_score
+
+
+ _CITATION = """\
+ @article{jaccard1912distribution,
+     title={The distribution of the flora in the alpine zone},
+     author={Jaccard, Paul},
+     journal={New Phytologist},
+     volume={11},
+     number={2},
+     pages={37--50},
+     year={1912},
+     publisher={Wiley Online Library}
+ }
+ """
+
+ _DESCRIPTION = """\
+ Jaccard similarity is a statistic used for gauging the similarity and diversity of sample sets.
+ The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of
+ the intersection divided by the size of the union of the sample sets. This implementation uses
+ scikit-learn's jaccard_score function.
+ """
+
+ _KWARGS_DESCRIPTION = """
+ Calculates the Jaccard similarity between predictions and references using scikit-learn.
+ Args:
+     predictions: 1d array-like, or label indicator array / sparse matrix
+         Predicted labels, as returned by a classifier.
+     references: 1d array-like, or label indicator array / sparse matrix
+         Ground truth (correct) labels.
+     labels: array-like of shape (n_classes,), default=None
+         The set of labels to include when average != 'binary', and their order if average is None.
+     pos_label: int, float, bool or str, default=1
+         The class to report if average='binary' and the data is binary.
+     average: {'micro', 'macro', 'samples', 'weighted', 'binary'} or None, default='binary'
+         This parameter is required for multiclass/multilabel targets.
+     sample_weight: array-like of shape (n_samples,), default=None
+         Sample weights.
+     zero_division: "warn", 0 or 1, default="warn"
+         Sets the value to return when there is a zero division.
+ Returns:
+     jaccard_similarity: float or ndarray of shape (n_unique_labels,)
+         Jaccard similarity score.
+ Examples:
+     >>> jaccard_metric = evaluate.load("jaccard_similarity")
+     >>> predictions = [0, 2, 1, 3]
+     >>> references = [0, 1, 2, 3]
+     >>> results = jaccard_metric.compute(predictions=predictions, references=references, average='macro')
+     >>> print(results)
+     {'jaccard_similarity': 0.5}
+ """
+
+
+ @evaluate.utils.file_utils.add_start_docstrings(_DESCRIPTION, _KWARGS_DESCRIPTION)
+ class JaccardSimilarity(evaluate.Metric):
+     def _info(self):
+         return evaluate.MetricInfo(
+             module_type="metric",
+             description=_DESCRIPTION,
+             citation=_CITATION,
+             inputs_description=_KWARGS_DESCRIPTION,
+             # Two feature schemas: plain labels for binary/multiclass inputs,
+             # sequences of labels for multilabel indicator inputs.
+             features=[
+                 datasets.Features({
+                     "predictions": datasets.Value("int32"),
+                     "references": datasets.Value("int32"),
+                 }),
+                 datasets.Features({
+                     "predictions": datasets.Sequence(datasets.Value("int32")),
+                     "references": datasets.Sequence(datasets.Value("int32")),
+                 }),
+             ],
+             reference_urls=[
+                 "https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html",
+                 "https://en.wikipedia.org/wiki/Jaccard_index",
+             ],
+         )
+
+     def _compute(self, predictions, references, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn'):
+         """Returns the Jaccard similarity score using scikit-learn."""
+         return {
+             "jaccard_similarity": jaccard_score(
+                 references,
+                 predictions,
+                 labels=labels,
+                 pos_label=pos_label,
+                 average=average,
+                 sample_weight=sample_weight,
+                 zero_division=zero_division,
+             )
+         }
requirements.txt ADDED
@@ -0,0 +1 @@
+ git+https://github.com/huggingface/evaluate@main
tests.py ADDED
@@ -0,0 +1,73 @@
+ test_cases = [
+     {
+         "predictions": [0, 0],
+         "references": [1, 1],
+         "result": {"jaccard_similarity": 0.0}
+     },
+     {
+         "predictions": [1, 1],
+         "references": [1, 1],
+         "result": {"jaccard_similarity": 1.0}
+     },
+     {
+         "predictions": [1, 0],
+         "references": [1, 1],
+         "result": {"jaccard_similarity": 0.5}
+     },
+     {
+         "predictions": [0, 1, 0, 1],
+         "references": [0, 1, 0, 1],
+         "result": {"jaccard_similarity": 1.0}
+     },
+     {
+         "predictions": [0, 1, 2, 3],
+         "references": [3, 2, 1, 0],
+         "result": {"jaccard_similarity": 0.0},
+         "kwargs": {"average": "macro"}
+     },
+     {
+         "predictions": [0, 0, 1, 1],
+         "references": [1, 1, 1, 1],
+         "result": {"jaccard_similarity": 0.5}
+     },
+     {
+         "predictions": [0, 0, 0, 1],
+         "references": [1, 1, 1, 1],
+         "result": {"jaccard_similarity": 0.25}
+     },
+     {
+         "predictions": [0, 1, 2, 3],
+         "references": [0, 1, 2, 3],
+         "result": {"jaccard_similarity": 1.0},
+         "kwargs": {"average": "macro"}
+     },
+     {
+         "predictions": [0, 0, 1, 1],
+         "references": [1, 1, 1, 1],
+         "result": {"jaccard_similarity": 0.5},
+         "kwargs": {"average": "binary", "pos_label": 1}
+     },
+     {
+         "predictions": [0, 1, 2, 0],
+         "references": [0, 1, 1, 2],
+         "result": {"jaccard_similarity": 0.375},
+         "kwargs": {"average": "weighted"}
+     },
+     {
+         "predictions": [[1, 1, 1], [1, 0, 0]],
+         "references": [[0, 1, 1], [1, 1, 0]],
+         "result": {"jaccard_similarity": 0.5833333333333334},
+         "kwargs": {"average": "samples"}
+     },
+     {
+         "predictions": [[1, 1, 1], [1, 0, 0]],
+         "references": [[0, 1, 1], [1, 1, 0]],
+         "result": {"jaccard_similarity": 0.6666666666666666},
+         "kwargs": {"average": "macro"}
+     },
+     {
+         "predictions": [[1, 1, 1], [1, 0, 0]],
+         "references": [[0, 1, 1], [1, 1, 0]],
+         "result": {"jaccard_similarity": [0.5, 0.5, 1.0]},
+         "kwargs": {"average": None}
+     },
+ ]
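Since the metric is a thin wrapper around scikit-learn, the expected values above can be spot-checked by calling `jaccard_score` directly; a minimal, self-contained runner sketch over two of the cases (the structure here mirrors the `test_cases` entries, not an actual test harness in the repo):

```python
from sklearn.metrics import jaccard_score

# Two of the cases above, checked directly against scikit-learn.
cases = [
    {"predictions": [1, 0], "references": [1, 1],
     "result": 0.5, "kwargs": {}},
    {"predictions": [0, 1, 2, 0], "references": [0, 1, 1, 2],
     "result": 0.375, "kwargs": {"average": "weighted"}},
]

for case in cases:
    score = jaccard_score(case["references"], case["predictions"], **case["kwargs"])
    assert abs(score - case["result"]) < 1e-9, case
```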