Changed pluralization for mean keys
- README.md +9 -5
- phone_distance.py +6 -6
README.md
CHANGED
@@ -39,9 +39,9 @@ The computation returns a dictionary with the following key and values:
 - **phone_error_rates** (`list` of `float`): Phone error rate (PER) gives edit distance in terms of phones for each prediction-reference pair, rather than Unicode characters, since phones can consist of multiple characters. It is normalized by the number of phones of the reference string. The result will have the same length as the input prediction and reference lists.
 - **mean_phone_error_rate** (`float`): Overall mean of PER.
 - **phone_feature_error_rates** (`list` of `float`): Phone feature error rate (PFER) is Levenshtein distance between strings where distance between individual phones is computed using Hamming distance between phonetic features for each prediction-reference pair. By default it is a metric that obeys the triangle inequality, but can also be normalized by number of phones.
-- **
+- **mean_phone_feature_error_rate** (`float`): Overall mean of PFER.
 - **feature_error_rates** (`list` of `float`): Feature error rate (FER) is the edit distance in terms of articulatory features normalized by the number of phones in the reference, computed for each prediction-reference pair.
-- **
+- **mean_feature_error_rate** (`float`): Overall mean of FER.
 
 
 #### Values from Popular Papers
@@ -52,19 +52,23 @@ The computation returns a dictionary with the following key and values:
 Simplest use case to compute phone error rates between two IPA strings:
 ```python
 >>> phone_distance.compute(predictions=["bob", "ði", "spin"], references=["pop", "ðə", "spʰin"])
-{'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215, 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], '
+{'phone_error_rates': [0.6666666666666666, 0.5, 0.25], 'mean_phone_error_rate': 0.47222222222222215, 'phone_feature_error_rates': [0.08333333333333333, 0.125, 0.041666666666666664], 'mean_phone_feature_error_rate': 0.08333333333333333, 'feature_error_rates': [0.027777777777777776, 0.0625, 0.30208333333333337], 'mean_feature_error_rate': 0.13078703703703706}
 ```
 
 Normalize phone feature error rate by the length of the reference string:
 ```python
 >>> phone_distance.compute(predictions=["bob", "ði"], references=["pop", "ðə"], is_normalize_pfer=True)
-{'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333,
+{'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333,
+'phone_feature_error_rates': [0.027777777777777776, 0.0625], 'mean_phone_feature_error_rate': 0.04513888888888889,
+'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
 ```
 
 Error rates may be greater than 1.0 if the reference string is shorter than the prediction string:
 ```python
 >>> phone_distance.compute(predictions=["bob"], references=["po"])
-{'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0,
+{'phone_error_rates': [1.0], 'mean_phone_error_rate': 1.0,
+'phone_feature_error_rates': [1.0416666666666667], 'mean_phone_feature_error_rate': 1.0416666666666667,
+'feature_error_rates': [0.020833333333333332], 'mean_feature_error_rate': 0.020833333333333332}
 ```
 
 Empty reference strings will cause a ValueError; handle them separately:
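The PER definition in the README hunk (edit distance over phones, normalized by the number of phones in the reference) can be sketched in plain Python. This is a minimal illustration and not the metric's implementation: `edit_distance` and `phone_error_rate` are hypothetical helpers, and the phones are segmented by hand here, whereas the metric itself relies on panphon. Raising `ValueError` for an empty reference mirrors the behavior the README describes.

```python
def edit_distance(pred, ref):
    """Levenshtein distance between two sequences of phones (not characters)."""
    previous = list(range(len(ref) + 1))
    for i, p in enumerate(pred, start=1):
        current = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            current.append(min(
                previous[j] + 1,         # delete a predicted phone
                current[j - 1] + 1,      # insert a reference phone
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def phone_error_rate(pred_phones, ref_phones):
    """Edit distance normalized by the number of phones in the reference."""
    if not ref_phones:
        # Assumption: mirrors the ValueError the README mentions for empty references.
        raise ValueError("reference must contain at least one phone")
    return edit_distance(pred_phones, ref_phones) / len(ref_phones)


# Hand-segmented phones: "pʰ" is one phone even though it is two characters.
pairs = [
    (["b", "o", "b"], ["p", "o", "p"]),
    (["ð", "i"], ["ð", "ə"]),
    (["s", "p", "i", "n"], ["s", "pʰ", "i", "n"]),
    (["b", "o", "b"], ["p", "o"]),
]
rates = [phone_error_rate(pred, ref) for pred, ref in pairs]
print(rates)
```

With this hand segmentation the rates come out to 2/3, 1/2, 1/4, and 1.0, which agrees with the `phone_error_rates` values shown in the README examples.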
phone_distance.py
CHANGED
@@ -65,15 +65,15 @@ Returns:
 phone_error_rates: list of floats giving PER for each prediction, reference pair
 mean_phone_error_rate: float, average PER across all examples
 phone_feature_error_rates: list of floats giving PFER for each prediction, reference pair
-
+mean_phone_feature_error_rate: float, average PFER across all examples
 feature_error_rates: list of floats giving FER for each prediction, reference pair
-
+mean_feature_error_rate: float, average FER across all examples
 
 Examples:
 Compare articulatory differences in voicing in "bob" vs. "pop" and different pronunciations of "the":
 >>> phone_distance = evaluate.load("ginic/phone_errors")
 >>> phone_distance.compute(predictions=["bob", "ði"], references=["pop", "ðə"])
-{'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333, 'phone_feature_error_rates': [0.08333333333333333, 0.125], '
+{'phone_error_rates': [0.6666666666666666, 0.5], 'mean_phone_error_rate': 0.5833333333333333, 'phone_feature_error_rates': [0.08333333333333333, 0.125], 'mean_phone_feature_error_rate': 0.10416666666666666, 'feature_error_rates': [0.027777777777777776, 0.0625], 'mean_feature_error_rate': 0.04513888888888889}
 
 Normalize PFER by the length of string with largest number of phones:
 >>> phone_distance = evaluate.load("ginic/phone_errors")
@@ -147,7 +147,7 @@ class PhoneDistance(evaluate.Metric):
 
 Returns:
 dict: {"phone_error_rates": list[float], "mean_phone_error_rate": float, "phone_feature_error_rates": list[float], "mean_phone_feature_error_rates": float,
-"feature_error_rates": list[float], "
+"feature_error_rates": list[float], "mean_feature_error_rate": float}
 """
 distance_computer = panphon.distance.Distance(feature_model=feature_model)
 phone_error_rates = []
@@ -168,8 +168,8 @@ class PhoneDistance(evaluate.Metric):
 "phone_error_rates": phone_error_rates,
 "mean_phone_error_rate": np.mean(phone_error_rates),
 "phone_feature_error_rates": hamming_distances,
-"
+"mean_phone_feature_error_rate": np.mean(hamming_distances),
 "feature_error_rates": feature_error_rates,
-"
+"mean_feature_error_rate": np.mean(feature_error_rates)
 }
 
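The last hunk builds each mean key by applying `np.mean` to its per-example list and names it with a singular `mean_..._rate` key. A rough sketch of that naming and aggregation pattern follows. It assumes a plain arithmetic mean and uses the standard library's `statistics.fmean` in place of NumPy to stay dependency-free; `add_mean_keys` is a hypothetical helper, not part of the metric, and `str.removesuffix` needs Python 3.9 or later.

```python
from statistics import fmean


def add_mean_keys(per_example):
    """Return a new dict with each per-example list plus a singular mean key."""
    results = {}
    for key, values in per_example.items():
        results[key] = values
        # "phone_error_rates" -> "mean_phone_error_rate" (singular key name)
        results["mean_" + key.removesuffix("s")] = fmean(values)
    return results


# Per-example values copied from the docstring example in the diff.
per_example = {
    "phone_error_rates": [0.6666666666666666, 0.5],
    "phone_feature_error_rates": [0.08333333333333333, 0.125],
    "feature_error_rates": [0.027777777777777776, 0.0625],
}
results = add_mean_keys(per_example)
print(sorted(key for key in results if key.startswith("mean_")))
```

Code that reads the metric's output should look up the singular names, for example `results["mean_feature_error_rate"]`.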