Spaces: evaluate-metric/exact_match

lvwerra HF staff committed on May 30, 2022

Commit d428842 • 1 Parent(s): b77696d

Update Space (evaluate main: e51c679b)

Files changed (2)

README.md +13 -13
exact_match.py +12 -13

README.md CHANGED

@@ -29,7 +29,7 @@ The exact match score of a set of predictions is the sum of all of the individua
 ## How to Use
 At minimum, this metric takes as input predictions and references:
 ```python
->>> from datasets import load
 >>> exact_match_metric = load("exact_match")
 >>> results = exact_match_metric.compute(predictions=predictions, references=references)
 ```
@@ -47,10 +47,10 @@ At minimum, this metric takes as input predictions and references:
 This metric outputs a dictionary with one value: the average exact match score.
 ```python
-{'exact_match': 100.0}
 ```
-This metric's range is 0-100, inclusive. Here, 0.0 means no prediction/reference pairs were matches, while 100.0 means they all were.
 #### Values from Popular Papers
 The exact match metric is often included in other metrics, such as SQuAD. For example, the [original SQuAD paper](https://nlp.stanford.edu/pubs/rajpurkar2016squad.pdf) reported an Exact Match score of 40.0%. They also report that the human performance Exact Match score on the dataset was 80.3%.
@@ -62,8 +62,8 @@ Without including any regexes to ignore:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"], 1))
-25.0
 ```
 Ignoring regexes "the" and "yell", as well as ignoring case and punctuation:
@@ -72,8 +72,8 @@ Ignoring regexes "the" and "yell", as well as ignoring case and punctuation:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"], 1))
-50.0
 ```
 Note that in the example above, because the regexes are ignored before the case is normalized, "yell" from "YELLING" is not deleted.
@@ -83,8 +83,8 @@ Ignoring "the", "yell", and "YELL", as well as ignoring case and punctuation:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"], 1))
-75.0
 ```
 Ignoring "the", "yell", and "YELL", as well as ignoring case, punctuation, and numbers:
@@ -93,8 +93,8 @@ Ignoring "the", "yell", and "YELL", as well as ignoring case, punctuation, and n
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
->>> print(round(results["exact_match"], 1))
-100.0
 ```
 An example that includes sentences:
@@ -103,8 +103,8 @@ An example that includes sentences:
 >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
 >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"], 1))
-33.3
 ```

 ## How to Use
 At minimum, this metric takes as input predictions and references:
 ```python
+>>> from evaluate import load
 >>> exact_match_metric = load("exact_match")
 >>> results = exact_match_metric.compute(predictions=predictions, references=references)
 ```
 This metric outputs a dictionary with one value: the average exact match score.
 ```python
+{'exact_match': 1.0}
 ```
+This metric's range is 0-1, inclusive. Here, 0.0 means no prediction/reference pairs were matches, while 1.0 means they all were.
 #### Values from Popular Papers
 The exact match metric is often included in other metrics, such as SQuAD. For example, the [original SQuAD paper](https://nlp.stanford.edu/pubs/rajpurkar2016squad.pdf) reported an Exact Match score of 40.0%. They also report that the human performance Exact Match score on the dataset was 80.3%.
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds)
+>>> print(round(results["exact_match"], 2))
+0.25
 ```
 Ignoring regexes "the" and "yell", as well as ignoring case and punctuation:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
+>>> print(round(results["exact_match"], 2))
+0.5
 ```
 Note that in the example above, because the regexes are ignored before the case is normalized, "yell" from "YELLING" is not deleted.
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
+>>> print(round(results["exact_match"], 2))
+0.75
 ```
 Ignoring "the", "yell", and "YELL", as well as ignoring case, punctuation, and numbers:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
+>>> print(round(results["exact_match"], 2))
+1.0
 ```
 An example that includes sentences:
 >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
 >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
 >>> results = exact_match.compute(references=refs, predictions=preds)
+>>> print(round(results["exact_match"], 2))
+0.33
 ```
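
The README examples above depend on the order of the normalization steps: ignored regexes are stripped before case is normalized. The following is a minimal sketch of that order in plain Python. It is not the library's implementation; the function name `exact_match_sketch` is hypothetical, and it assumes non-empty inputs of equal length.

```python
import re
import string


def exact_match_sketch(predictions, references, regexes_to_ignore=None,
                       ignore_case=False, ignore_punctuation=False,
                       ignore_numbers=False):
    """Hypothetical re-implementation for illustration only."""
    preds = list(predictions)
    refs = list(references)

    # Step 1: strip ignored regexes while the original case is still intact.
    for pattern in regexes_to_ignore or []:
        preds = [re.sub(pattern, "", p) for p in preds]
        refs = [re.sub(pattern, "", r) for r in refs]

    # Step 2: only now lowercase, so "yell" never sees "YELLING" as a match.
    if ignore_case:
        preds = [p.lower() for p in preds]
        refs = [r.lower() for r in refs]

    # Step 3: drop punctuation characters.
    if ignore_punctuation:
        table = str.maketrans("", "", string.punctuation)
        preds = [p.translate(table) for p in preds]
        refs = [r.translate(table) for r in refs]

    # Step 4: drop digits.
    if ignore_numbers:
        table = str.maketrans("", "", string.digits)
        preds = [p.translate(table) for p in preds]
        refs = [r.translate(table) for r in refs]

    # Score on the 0-1 scale: the fraction of pairs that match exactly.
    matches = [p == r for p, r in zip(preds, refs)]
    return {"exact_match": sum(matches) / len(matches)}


refs = ["the cat", "theater", "YELLING", "agent007"]
preds = ["cat?", "theater", "yelling", "agent"]

# "yell" alone leaves "YELLING" untouched, so that pair still differs.
print(exact_match_sketch(preds, refs, regexes_to_ignore=["the ", "yell"],
                         ignore_case=True, ignore_punctuation=True))
# Adding "YELL" removes the prefix from the reference as well.
print(exact_match_sketch(preds, refs, regexes_to_ignore=["the ", "yell", "YELL"],
                         ignore_case=True, ignore_punctuation=True))
```

Under these assumptions the two calls print `{'exact_match': 0.5}` and `{'exact_match': 0.75}`, matching the README examples.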

exact_match.py CHANGED

@@ -40,44 +40,43 @@ Args:
     ignore_numbers: Boolean, defaults to False. If true, removes all numbers before
         comparing predictions and references.
 Returns:
-    exact_match: Dictionary containing exact_match rate. Possible values are between 0.0 and 100.0, inclusive.
 Examples:
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds)
-    >>> print(round(results["exact_match"], 1))
-    25.0
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
-    >>> print(round(results["exact_match"], 1))
-    50.0
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
-    >>> print(round(results["exact_match"], 1))
-    75.0
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
-    >>> print(round(results["exact_match"], 1))
-    100.0
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
     >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
     >>> results = exact_match.compute(references=refs, predictions=preds)
-    >>> print(round(results["exact_match"], 1))
-    33.3
 """
 _CITATION = """
@@ -134,4 +133,4 @@ class ExactMatch(evaluate.EvaluationModule):
         score_list = predictions == references
-        return {"exact_match": np.mean(score_list) * 100}

     ignore_numbers: Boolean, defaults to False. If true, removes all numbers before
         comparing predictions and references.
 Returns:
+    exact_match: Dictionary containing exact_match rate. Possible values are between 0.0 and 1.0, inclusive.
 Examples:
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds)
+    >>> print(round(results["exact_match"], 2))
+    0.25
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
+    >>> print(round(results["exact_match"], 2))
+    0.5
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
+    >>> print(round(results["exact_match"], 2))
+    0.75
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["the cat", "theater", "YELLING", "agent007"]
     >>> preds = ["cat?", "theater", "yelling", "agent"]
     >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
+    >>> print(round(results["exact_match"], 2))
+    1.0
     >>> exact_match = evaluate.load("exact_match")
     >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
     >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
     >>> results = exact_match.compute(references=refs, predictions=preds)
+    >>> print(round(results["exact_match"], 2))
+    0.33
 """
 _CITATION = """
         score_list = predictions == references
+        return {"exact_match": np.mean(score_list)}
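
Since this commit moves the returned score from the 0-100 scale to the 0-1 scale, downstream code that compares against percentage thresholds may need adjusting. A minimal sketch of one way to bridge the two scales follows. The helper name `as_percentage` is hypothetical and not part of `evaluate`, and the dictionary is a stand-in for what `compute()` returns after this change.

```python
def as_percentage(results):
    """Hypothetical helper: map a 0-1 exact_match result to the old 0-100 scale."""
    return {"exact_match": results["exact_match"] * 100}


# Stand-in for the dictionary returned by compute() after this commit.
new_style = {"exact_match": 0.25}

# Code written against the old scale can wrap the result instead of
# changing every threshold comparison.
old_style = as_percentage(new_style)
print(old_style["exact_match"])  # 25.0
```

Converting at the call site keeps existing percentage thresholds valid. Alternatively, the thresholds themselves can be divided by 100.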