ginic committed on
Commit
14b55be
1 Parent(s): 2b6b085

Changed pluralization for mean keys

Files changed (2)
  1. README.md +9 -5
  2. phone_distance.py +6 -6
README.md CHANGED
@@ -39,9 +39,9 @@ The computation returns a dictionary with the following key and values:
  - **phone_error_rates** (`list` of `float`): Phone error rate (PER) gives edit distance in terms of phones for each prediction-reference pair, rather than Unicode characters, since phones can consist of multiple characters. It is normalized by the number of phones of the reference string. The result with have the same length as the input prediction and reference lists.
  - **mean_phone_error_rate** (`float`): Overall mean of PER.
  - **phone_feature_error_rates** (`list` of `float`): Phone feature error rate (PFER) is Levenshtein distance between strings where distance between individual phones is computed using Hamming distance between phonetic features for each prediction-reference pair. By default it is a metric that obeys the triangle equality, but can also be normalized by number of phones.
- - **mean_phone_feature_error_rates** (`float`): Overall mean of PFER.
+ - **mean_phone_feature_error_rate** (`float`): Overall mean of PFER.
  - **feature_error_rates** (`list` of `float`): Feature error rate (FER) is the edit distance in terms of articulatory features normalized by the number of phones in the reference, computed for each prediction-reference pair.
- - **mean_feature_error_rates** (`float`): Overall mean of FER.
+ - **mean_feature_error_rate** (`float`): Overall mean of FER.
 
 
  #### Values from Popular Papers
@@ -52,19 +52,23 @@ The computation returns a dictionary with the following key and values:
  Simplest use case to compute phone error rates between two IPA strings:
  ```python
  >>> phone_distance.compute(predictions=["bob", "ði", "spin"], references=["pop", "ðə", "spʰin"])
- {'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215, 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], 'mean_phone_feature_error_rates': 0.08333333333333333, 'feature_error_rates': [0.027777777777777776, 0.0625, 0.30208333333333337], 'mean_feature_error_rates': 0.13078703703703706}
+ {'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215, 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], 'mean_phone_feature_error_rate': 0.08333333333333333, 'feature_error_rates': [0.027777777777777776, 0.0625, 0.30208333333333337], 'mean_feature_error_rate': 0.13078703703703706}
  ```
 
  Normalize phone feature error rate by the length of the reference string:
  ```python
  >>> phone_distance.compute(predictions=["bob", "ði"], references=["pop", "ðə"], is_normalize_pfer=True)
- {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333, 'phone_feature_error_rates': [0.027777777777777776, 0.0625], 'mean_phone_feature_error_rates': 0.04513888888888889, 'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rates': 0.04513888888888889}
+ {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333,
+ 'phone_feature_error_rates': [0.027777777777777776, 0.0625], 'mean_phone_feature_error_rate': 0.04513888888888889,
+ 'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
  ```
 
  Error rates may be greater than 1.0 if the reference string is shorter than the prediction string:
  ```python
  >>> phone_distance.compute(predictions=["bob"], references=["po"])
- {'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0, 'phone_feature_error_rates': [1.0416666666666667], 'mean_phone_feature_error_rates': 1.0416666666666667, 'feature_error_rates': [0.020833333333333332], 'mean_feature_error_rates': 0.020833333333333332}
+ {'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0,
+ 'phone_feature_error_rates': [1.0416666666666667], 'mean_phone_feature_error_rate': 1.0416666666666667,
+ 'feature_error_rates': [0.020833333333333332], 'mean_feature_error_rate': 0.020833333333333332}
  ```
 
  Empty reference strings will cause an ValueError, you should handle them separately:
phone_distance.py CHANGED
@@ -65,15 +65,15 @@ Returns:
  phone_error_rates: list of floats giving PER for each prediction, reference pair
  mean_phone_error_rate: float, average PER across all examples
  phone_feature_error_rates: list of floats giving PFER for each prediction, reference pair
- mean_phone_feature_error_rates: float, average PFER across all examples
+ mean_phone_feature_error_rate: float, average PFER across all examples
  feature_error_rates: list of floats giving FER for each prediction, reference pair
- mean_feature_error_rates: float, average FER across all examples
+ mean_feature_error_rate: float, average FER across all examples
 
  Examples:
  Compare articulatory differences in voicing in "bob" vs. "pop" and different pronunciations of "the":
  >>> phone_distance = evaluate.load("ginic/phone_errors")
  >>> phone_distance.compute(predictions=["bob", "ði"], references=["pop", "ðə"])
- {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333, 'phone_feature_error_rates': [0.08333333333333333, 0.125], 'mean_phone_feature_error_rates': 0.10416666666666666, 'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rates': 0.04513888888888889}
+ {'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333, 'phone_feature_error_rates': [0.08333333333333333, 0.125], 'mean_phone_feature_error_rate': 0.10416666666666666, 'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
 
  Normalize PFER by the length of string with largest number of phones:
  >>> phone_distance = evaluate.load("ginic/phone_errors")
@@ -147,7 +147,7 @@ class PhoneDistance(evaluate.Metric):
 
  Returns:
  dict: {"phone_error_rates": list[float], "mean_phone_error_rate": float, "phone_feature_error_rates": list[float], "mean_phone_feature_error_rates": float,
- "feature_error_rates": list[float], "mean_feature_error_rates": float}
+ "feature_error_rates": list[float], "mean_feature_error_rate": float}
  """
  distance_computer = panphon.distance.Distance(feature_model=feature_model)
  phone_error_rates = []
@@ -168,8 +168,8 @@ class PhoneDistance(evaluate.Metric):
  "phone_error_rates": phone_error_rates,
  "mean_phone_error_rate": np.mean(phone_error_rates),
  "phone_feature_error_rates": hamming_distances,
- "mean_phone_feature_error_rates": np.mean(hamming_distances),
+ "mean_phone_feature_error_rate": np.mean(hamming_distances),
  "feature_error_rates": feature_error_rates,
- "mean_feature_error_rates": np.mean(feature_error_rates)
+ "mean_feature_error_rate": np.mean(feature_error_rates)
  }
 
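
Because this commit renames two keys in the returned dictionary, code that still reads `mean_phone_feature_error_rates` or `mean_feature_error_rates` would raise a `KeyError` against the new output. One possible way for a caller to bridge the change is sketched below. The names `RENAMED_MEAN_KEYS` and `normalize_mean_keys` are hypothetical and not part of the metric; the sample values are taken from the "bob" vs. "po" example in the README.

```python
# Hypothetical mapping (not part of the metric) from the old plural mean keys
# to the singular keys introduced in this commit.
RENAMED_MEAN_KEYS = {
    "mean_phone_feature_error_rates": "mean_phone_feature_error_rate",
    "mean_feature_error_rates": "mean_feature_error_rate",
}


def normalize_mean_keys(results):
    """Return a copy of a results dict with any old plural mean keys renamed.

    Keys that are not in RENAMED_MEAN_KEYS are copied through unchanged,
    so results that already use the new names are left as they are.
    """
    return {RENAMED_MEAN_KEYS.get(key, key): value for key, value in results.items()}


# Results in the pre-commit shape, using the README's "bob" vs. "po" values.
old_style = {
    "phone_error_rates": [1.0],
    "mean_phone_error_rate": 1.0,
    "phone_feature_error_rates": [1.0416666666666667],
    "mean_phone_feature_error_rates": 1.0416666666666667,
    "feature_error_rates": [0.020833333333333332],
    "mean_feature_error_rates": 0.020833333333333332,
}

new_style = normalize_mean_keys(old_style)
print(new_style["mean_phone_feature_error_rate"])
print(new_style["mean_feature_error_rate"])
```

The per-pair list keys (`phone_error_rates`, `phone_feature_error_rates`, `feature_error_rates`) keep their plural names, so only the two mean keys need mapping.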