---
title: matching_series
tags:
- evaluate
- metric
description: "Matching-based time-series generation metric"
sdk: gradio
sdk_version: 3.50
app_file: app.py
pinned: false
---
# Metric Card for matching_series
## Metric Description
Matching Series is a metric for evaluating time-series generation models. It matches each generated time-series to the original (reference) time-series and measures the distance between matched instances, by default the Mean Squared Error (MSE). The distance-based scores are greater than or equal to 0, where 0 indicates a perfect generation.
## How to Use
At minimum, the metric requires the original time-series and the generated time-series as input. The metric can be used to evaluate the performance of time-series generation models.
```python
>>> import evaluate
>>> import numpy as np
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000, return_all=True)
>>> print(results)
{'precision_distance': 0.1573285013437271, 'recall_distance': 0.15106813609600067, 'mean_distance': 0.1541983187198639, 'index_distance': 0.16858606040477753, 'matching_precision': 0.06, 'matching_recall': 1.0, 'matching_f1': 0.11320756503381972, 'cuc': 0.12428571428571429, 'macro_precision_distance': 0.13803552389144896, 'macro_recall_distance': 0.12179495096206665, 'macro_mean_distance': 0.1299152374267578, 'macro_index_distance': 0.16858604848384856, 'macro_matching_precision': 0.094, 'macro_matching_recall': 0.97, 'macro_matching_f1': 0.17132608782381706, 'macro_cuc': 0.11419285714285714, 'distance': array([[[0.20763363, 0.16514072, 0.18695284, ..., 0.15037987,
0.19424284, 0.15943716],
[0.17150438, 0.18020014, 0.17024504, ..., 0.18492931,
0.18814348, 0.204207 ],
[0.1769202 , 0.15609328, 0.17568389, ..., 0.17731658,
0.2027854 , 0.13216409],
...,
[0.1838122 , 0.19475608, 0.14176111, ..., 0.1635111 ,
0.1652672 , 0.17145865],
[0.16084194, 0.14208058, 0.17567575, ..., 0.15595785,
0.16614595, 0.17834347],
[0.16388315, 0.14126392, 0.18021484, ..., 0.16791071,
0.18403953, 0.16666758]],
[[0.16838932, 0.18878576, 0.17654441, ..., 0.1747057 ,
0.16590554, 0.16901629],
[0.16553226, 0.1882645 , 0.17863466, ..., 0.19269662,
0.20451452, 0.19941731],
[0.16502398, 0.16619626, 0.18069996, ..., 0.16124909,
0.18933088, 0.1495165 ],
...,
[0.15946846, 0.19988221, 0.17965002, ..., 0.12951666,
0.2067793 , 0.13811146],
[0.16227122, 0.17736743, 0.18641905, ..., 0.15038314,
0.20186146, 0.17849396],
[0.16410898, 0.18323919, 0.16945514, ..., 0.15783694,
0.21556957, 0.17172968]],
[[0.18094379, 0.1364854 , 0.18436092, ..., 0.187335 ,
0.16240291, 0.13713893],
[0.18005298, 0.15323727, 0.15788248, ..., 0.19451861,
0.12822135, 0.14064161],
[0.1564556 , 0.17312287, 0.1856657 , ..., 0.17237219,
0.1596888 , 0.16547912],
...,
[0.15611127, 0.16121496, 0.15533476, ..., 0.16520709,
0.1427248 , 0.19455005],
[0.17268528, 0.17360437, 0.15962966, ..., 0.18134868,
0.15509704, 0.20222983],
[0.18704675, 0.15934442, 0.14928888, ..., 0.18904984,
0.16192877, 0.18576236]],
...,
[[0.13717972, 0.15645625, 0.16123378, ..., 0.19453087,
0.14441733, 0.1487963 ],
[0.1454296 , 0.13368016, 0.18665504, ..., 0.16096605,
0.15130125, 0.18332979],
[0.14654924, 0.19097947, 0.19629759, ..., 0.15887487,
0.19266474, 0.17430782],
...,
[0.161704 , 0.16357127, 0.18512094, ..., 0.16441964,
0.13961458, 0.17298506],
[0.1366249 , 0.15852758, 0.1982772 , ..., 0.18822236,
0.16153064, 0.19617072],
[0.14570995, 0.15005183, 0.19667573, ..., 0.1856473 ,
0.18603194, 0.19179863]],
[[0.17813908, 0.176182 , 0.16847256, ..., 0.16903524,
0.17150073, 0.15068175],
[0.17632519, 0.1404587 , 0.16388708, ..., 0.16873878,
0.15744762, 0.198475 ],
[0.14986345, 0.1517829 , 0.17624639, ..., 0.18365957,
0.17399347, 0.15581599],
...,
[0.16128553, 0.1974935 , 0.13766351, ..., 0.14026196,
0.15450196, 0.16110381],
[0.16281141, 0.14699166, 0.16935429, ..., 0.1394466 ,
0.1717883 , 0.16191883],
[0.14886455, 0.1603608 , 0.15172943, ..., 0.12851712,
0.19859877, 0.15576601]],
[[0.20230632, 0.19680001, 0.17143433, ..., 0.18601838,
0.15998998, 0.16043548],
[0.19753966, 0.19073424, 0.15046756, ..., 0.18833323,
0.16755773, 0.20127842],
[0.16012056, 0.16638812, 0.16493171, ..., 0.15849902,
0.20269662, 0.1857642 ],
...,
[0.16341361, 0.19168772, 0.16597596, ..., 0.15715535,
0.18122095, 0.17266828],
[0.1570099 , 0.18294124, 0.16713732, ..., 0.17442709,
0.17020254, 0.18804537],
[0.16752282, 0.1295177 , 0.18792175, ..., 0.13976808,
0.21054329, 0.18118018]]], dtype=float32), 'match': array([4, 7, 3, 9, 4, 0, 7, 5, 4, 7, 9, 7, 7, 5, 7, 0, 0, 7, 4, 3, 3, 2,
8, 9, 4, 4, 5, 1, 4, 9, 0, 2, 7, 3, 6, 5, 6, 3, 2, 2, 2, 6, 9, 4,
4, 9, 1, 6, 0, 6, 9, 2, 0, 6, 7, 2, 0, 4, 5, 2, 3, 9, 2, 3, 9, 1,
6, 4, 8, 9, 7, 4, 6, 5, 5, 6, 9, 5, 6, 2, 9, 4, 9, 3, 2, 9, 9, 7,
9, 5, 9, 1, 7, 6, 4, 4, 5, 4, 7, 5]), 'match_inv': array([15, 91, 79, 4, 4, 4, 49, 4, 49, 45]), 'coverages': [0.10000000000000002, 0.16666666666666666, 0.3666666666666667, 0.6333333333333333, 0.8333333333333334, 0.9, 1.0], 'precision_distance_features': [0.1383965164422989, 0.13804036378860474, 0.1388234943151474, 0.1392393559217453, 0.1357768476009369, 0.1364508718252182, 0.14039862155914307, 0.13417008519172668, 0.1368638128042221, 0.14219526946544647], 'recall_distance_features': [0.11730053275823593, 0.12232911586761475, 0.12200610339641571, 0.12571024894714355, 0.12081331014633179, 0.11693283170461655, 0.12660981714725494, 0.12248671054840088, 0.11726576089859009, 0.12649507820606232], 'mean_distance_features': [0.1278485246002674, 0.13018473982810974, 0.13041479885578156, 0.13247480243444443, 0.12829507887363434, 0.12669185176491737, 0.133504219353199, 0.12832839787006378, 0.1270647868514061, 0.1343451738357544], 'index_distance_features': [0.17064405977725983, 0.17019756138324738, 0.17373089492321014, 0.17575454711914062, 0.15942324697971344, 0.1615942418575287, 0.16519878804683685, 0.1714271903038025, 0.17072594165802002, 0.16716401278972626], 'matching_precision_features': [0.1, 0.09, 0.1, 0.1, 0.09, 0.09, 0.1, 0.08, 0.09, 0.1], 'matching_recall_features': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.9, 0.9, 0.9, 1.0], 'matching_f1_features': [0.18181819851239656, 0.16513763164885095, 0.18181819851239656, 0.18181819851239656, 0.16513763164885095, 0.16513763164885095, 0.18000001639999985, 0.14693879251145342, 0.16363638033057834, 0.18181819851239656], 'cuc_features': [0.11935714285714286, 0.11578571428571431, 0.11814285714285715, 0.12407142857142857, 0.11207142857142856, 0.11821428571428572, 0.10807142857142855, 0.09635714285714285, 0.10700000000000001, 0.12285714285714286], 'coverages_features': [[0.10000000000000002, 0.20000000000000004, 0.26666666666666666, 0.4666666666666666, 0.7666666666666666, 0.8666666666666667, 1.0], [0.10000000000000002, 0.20000000000000004, 
0.3666666666666667, 0.5666666666666668, 0.6, 0.8333333333333334, 1.0], [0.10000000000000002, 0.16666666666666666, 0.26666666666666666, 0.4666666666666666, 0.6999999999999998, 0.8666666666666667, 1.0], [0.10000000000000002, 0.20000000000000004, 0.3, 0.6, 0.7333333333333333, 0.9333333333333332, 1.0], [0.10000000000000002, 0.20000000000000004, 0.3, 0.5, 0.6666666666666666, 0.7666666666666666, 1.0], [0.10000000000000002, 0.20000000000000004, 0.3333333333333333, 0.5333333333333333, 0.7666666666666666, 0.8333333333333334, 1.0], [0.10000000000000002, 0.20000000000000004, 0.3, 0.5333333333333333, 0.6999999999999998, 0.7666666666666666, 0.9], [0.10000000000000002, 0.20000000000000004, 0.2333333333333333, 0.4666666666666666, 0.5333333333333333, 0.6333333333333333, 0.9], [0.10000000000000002, 0.16666666666666666, 0.26666666666666666, 0.4666666666666666, 0.5666666666666667, 0.8000000000000002, 0.9], [0.10000000000000002, 0.16666666666666666, 0.30000000000000004, 0.5666666666666667, 0.7999999999999999, 0.9, 1.0]]}
```
### Inputs
- **predictions**: (list of list of list of float or numpy.ndarray): The generated time-series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references**: (list of list of list of float or numpy.ndarray): The original time-series. The shape of the array should be `(num_reference, seq_len, num_features)`.
- **batch_size**: (int, optional): The batch size for computing the metric. The cost of each batch grows quadratically with this value. Default is None.
- **cuc_n_calculation**: (int, optional): The number of times the coverage computation is repeated, since it relies on random sampling. Default is 3.
- **cuc_n_samples**: (list of int, optional): The sample sizes at which the coverage is computed. Default is $[2^i~\text{for}~i \leq \log_2 n] + [n]$, where $n$ is the number of generated instances.
- **metric**: (str, optional): The metric to measure distance between examples. Default is "mse". Available options are "mse", "mae", "rmse".
- **num_processes**: (int, optional): The number of processes to use for computing the distance. Default is 1.
- **instance_normalization**: (bool, optional): Whether to normalize the instances along the time axis. Default is False.
- **return_distance**: (bool, optional): Whether to return the distance matrix. Default is False.
- **return_matching**: (bool, optional): Whether to return the matching matrix. Default is False.
- **return_each_features**: (bool, optional): Whether to return the results for each feature. Default is False.
- **return_coverages**: (bool, optional): Whether to return the coverages. Default is False.
- **return_all**: (bool, optional): Whether to return all the results. Default is False.
- **dtype**: (str, optional): The data type used for computation. Default is "float32".
- **eps**: (float, optional): The epsilon value to avoid division by zero. Default is 1e-8.
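
To make the `metric`, `instance_normalization`, and `eps` options more concrete, here is a minimal NumPy sketch. It assumes that instance normalization means z-scoring each instance and feature along the time axis, and that `eps` is added to the standard deviation; the metric's actual implementation may differ. The helper names `instance_normalize` and `pair_distance` are hypothetical and are not part of the metric's API.

```python
import numpy as np


def instance_normalize(x, eps=1e-8):
    """Z-score each instance and feature along the time axis (axis 1).

    Assumption: this is one plausible reading of `instance_normalization`;
    `eps` guards against division by zero for constant series.
    """
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)
    return (x - mean) / (std + eps)


def pair_distance(a, b, metric="mse"):
    """Distance between two series of shape (seq_len, num_features)."""
    err = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    if metric == "mse":
        return float(np.mean(err ** 2))
    if metric == "mae":
        return float(np.mean(np.abs(err)))
    if metric == "rmse":
        return float(np.sqrt(np.mean(err ** 2)))
    raise ValueError(f"unknown metric: {metric}")
```

After normalization, every instance has approximately zero mean and unit variance per feature, so the distances compare the shapes of the series rather than their scale or offset.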
### Output Values
Let prediction instances be $P = \{p_1, p_2, \ldots, p_n\}$ and reference instances be $R = \{r_1, r_2, \ldots, r_m\}$.
- **precision_distance**: (float): Average of the distance between the generated instance and the reference instance with the lowest distance. Intuitively, this is similar to precision in classification. In the equation, $\frac{1}{n} \sum_{i=1}^{n} \min_{j} \mathrm{distance}(p_i, r_j)$.
- **recall_distance**: (float): Average of the distance between the reference instance and the generated instance with the lowest distance. Intuitively, this is similar to recall in classification. In the equation, $\frac{1}{m} \sum_{j=1}^{m} \min_{i} \mathrm{distance}(p_i, r_j)$.
- **mean_distance**: (float): Average of the precision_distance and recall_distance.
- **index_distance**: (float): Average of the distance between the generated instance and the reference instance with the same index. In the equation, $\frac{1}{n} \sum_{i=1}^{n} \mathrm{distance}(p_i, r_i)$.
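- **matching_precision**: (float): Precision of the matching instances, i.e., the fraction of generated instances that are the nearest neighbor of at least one reference instance. It indicates how well the predictions are covered by the references. In the equation, $\frac{ | \{ \arg\min_{i} \mathrm{distance}(p_i, r_j) \mid j = 1, \ldots, m \} | }{n}$.
- **matching_recall**: (float): Recall of the matching instances, i.e., the fraction of reference instances that are the nearest neighbor of at least one generated instance. It indicates how well the predictions cover the references. In the equation, $\frac{ | \{ \arg\min_{j} \mathrm{distance}(p_i, r_j) \mid i = 1, \ldots, n \} | }{m}$.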
- **matching_f1**: (float): F1-score of the matching instances, harmonic mean of the matching_precision and matching_recall.
- **coverages**: (list of float): Coverage of the matching instances computed on the sampled generated data in cuc_n_samples. In the equation, $\left[ \frac{1}{m} \left| \{ \arg\min_{j} \mathrm{distance}(p_i, r_j) \mid p_i \in \mathrm{sample}(P, \mathrm{n\_sample}) \} \right| ~\text{for}~ \mathrm{n\_sample} \in \mathrm{cuc\_n\_samples} \right]$.
- **cuc**: (float): Area under the coverage curve. In the equation, $\int_{0}^{n} \mathrm{coverage}(x) dx$. The integral is approximated with the trapezoidal rule.
- **.\*_features**: (list of float): The values computed individually for each feature.
- **macro_.\***: (float): Averaged values computed for each feature, average of the \*\_features.
- **distance**: (numpy.ndarray): The distance matrix between the generated instances and the reference instances.
- **match**: (numpy.ndarray): The matching from generated instances to reference instances: for each generated instance, the index of the nearest reference instance.
- **match_inv**: (numpy.ndarray): The matching from reference instances to generated instances: for each reference instance, the index of the nearest generated instance.
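
The scalar scores above can be sketched directly from a pairwise distance matrix. The following NumPy code is an illustrative reading of the equations, not the metric's own implementation: the function names `pairwise_mse`, `matching_scores`, and `coverage` are hypothetical, the distance is assumed to be MSE averaged over time steps and features, and batching, per-feature scores, and CUC are omitted.

```python
import numpy as np


def pairwise_mse(predictions, references):
    """Return an (n, m) matrix of MSE between every prediction and reference.

    Both inputs have shape (num_instances, seq_len, num_features).
    """
    diff = predictions[:, None, :, :] - references[None, :, :, :]
    return (diff ** 2).mean(axis=(2, 3))


def matching_scores(distance):
    """Compute the scalar scores from an (n, m) distance matrix."""
    n, m = distance.shape
    match = distance.argmin(axis=1)      # nearest reference per prediction
    match_inv = distance.argmin(axis=0)  # nearest prediction per reference
    precision_distance = float(distance.min(axis=1).mean())
    recall_distance = float(distance.min(axis=0).mean())
    matching_precision = len(np.unique(match_inv)) / n
    matching_recall = len(np.unique(match)) / m
    denom = matching_precision + matching_recall
    matching_f1 = 2 * matching_precision * matching_recall / denom if denom > 0 else 0.0
    return {
        "precision_distance": precision_distance,
        "recall_distance": recall_distance,
        "mean_distance": (precision_distance + recall_distance) / 2,
        "matching_precision": matching_precision,
        "matching_recall": matching_recall,
        "matching_f1": matching_f1,
        "match": match,
        "match_inv": match_inv,
    }


def coverage(distance, n_sample, rng):
    """Fraction of references matched by a random subset of n_sample predictions."""
    n, m = distance.shape
    idx = rng.choice(n, size=n_sample, replace=False)
    return len(np.unique(distance[idx].argmin(axis=1))) / m
```

With `n_sample` equal to the number of generated instances, `coverage` reduces to `matching_recall`, which is consistent with the last entry of `coverages` in the example output above.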
<!-- #### Values from Popular Papers -->
<!-- *Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.* -->
<!-- ### Examples -->
<!-- *Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.* -->
## Limitations and Bias
This metric is based on the assumption that the generated time-series should match the original time-series. This may not be the case in some scenarios. The metric may not be suitable for evaluating time-series generation models that are not required to match the original time-series.
<!-- ## Citation -->
<!-- *Cite the source where this metric was introduced.* -->
<!-- ## Further References -->
<!-- *Add any useful further references.* -->