---
title: matching_series
tags:
- evaluate
- metric
description: "Matching-based time-series generation metric"
sdk: gradio
sdk_version: 3.50
app_file: app.py
pinned: false
---

# Metric Card for matching_series

## Metric Description

Matching Series is a metric for evaluating time-series generation models. It is based on the idea of matching the generated time-series with the original (reference) time-series: each series is paired with its closest counterpart on the other side, and the metric reports the Mean Squared Error (MSE) between the matched instances. The metric outputs a score greater than or equal to 0, where 0 indicates a perfect generation.

## How to Use

At minimum, the metric requires the original time-series (`references`) and the generated time-series (`predictions`) as input. It can be used to evaluate the performance of time-series generation models.

```python
>>> import numpy as np
>>> import evaluate
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000)
>>> print(results)
{'matching_mse': 0.15250070138019745, 'harmonic_mean': 0.15246672297315564, 'covered_mse': 0.15243275970407652, 'index_mse': 0.16772539808686357, 'matching_mse_features': [0.11976368411913872, 0.1238622735860897, 0.1235259257706047, 0.12385236248438022, 0.12241466736218365, 0.12328439290438079, 0.1232240061707885, 0.12342319803028035, 0.12235222572924524, 0.12437865819262514], 'harmonic_mean_features': [0.12010478503934609, 0.12379899085819131, 0.12321441761307182, 0.12273884163905005, 0.12256126537300535, 0.12323289686030311, 0.12323847434641247, 0.12333469339243568, 0.12273530480438972, 0.12390254295493403], 'covered_mse_features': [0.12044783449951382, 0.1237357727610885, 0.12290447662839017, 0.12164516506865233, 0.12270821492248948, 0.12318144381818667, 0.12325294591995689, 0.12324631559392285, 0.12312079021887229, 0.12343005890751833], 'index_mse_features': [0.16331894487549958, 0.1679797859239729, 0.16904075114728268, 0.16962427920551068, 0.16915910655024802, 0.16686197230602684, 0.17056311327206022, 0.1638796919248867, 0.16736730842643857, 0.16945902723670975], 'macro_matching_mse': 0.1230081394349717, 'macro_covered_mse': 0.12276730183385913, 'macro_harmonic_mean': 0.12288622128811397}
```

### Inputs

- **predictions** (list of list of list of float or numpy.ndarray): The generated time-series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references** (list of list of list of float or numpy.ndarray): The original time-series. The shape of the array should be `(num_reference, seq_len, num_features)`.

### Output Values

The metric returns a dictionary with the following keys. All scores are greater than or equal to 0, and lower is better; 0 indicates a perfect generation.

- **matching_mse**: MSE over the pairs obtained by matching each generated series to its closest reference series; it reflects the fidelity of the generations.
- **covered_mse**: MSE over the pairs obtained by matching each reference series to its closest generated series; it reflects how well the generations cover the references.
- **harmonic_mean**: the harmonic mean of `matching_mse` and `covered_mse`.
- **index_mse**: MSE between the generated and reference series aligned by index, without matching.
- **matching_mse_features**, **covered_mse_features**, **harmonic_mean_features**, **index_mse_features**: the same scores computed independently for each feature.
- **macro_matching_mse**, **macro_covered_mse**, **macro_harmonic_mean**: the per-feature scores averaged over features.

#### Values from Popular Papers

*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*

### Examples

The examples below are illustrative sketches; shapes and values are chosen arbitrarily.
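A sanity check, assuming the usage shown above: generations close to the references should score lower than unrelated ones.

```python
>>> import numpy as np
>>> import evaluate
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> references = np.random.rand(10, 50, 3)
>>> near = references + 0.01 * np.random.randn(10, 50, 3)  # near-perfect generations
>>> far = np.random.rand(10, 50, 3)                        # unrelated generations
>>> near_results = metric.compute(references=references, predictions=near, batch_size=100)
>>> far_results = metric.compute(references=references, predictions=far, batch_size=100)
>>> near_results["matching_mse"] < far_results["matching_mse"]  # lower is better
True
```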
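Plain nested Python lists are also accepted, per the Inputs section above. A small sketch with 3 generated series, 2 reference series, 3 time steps, and 1 feature:

```python
>>> import evaluate
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> references = [[[0.0], [0.5], [1.0]],
...               [[1.0], [0.5], [0.0]]]
>>> predictions = [[[0.1], [0.4], [0.9]],
...                [[0.9], [0.6], [0.1]],
...                [[0.5], [0.5], [0.5]]]
>>> results = metric.compute(references=references, predictions=predictions, batch_size=8)
>>> 0.0 <= results["matching_mse"]  # scores are non-negative
True
```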
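To build intuition for the scores, here is a minimal numpy sketch of the matching idea. It is not this metric's implementation (the actual matching, batching, and per-feature handling may differ); it only illustrates "MSE between matched instances" under the assumption that each generated series is matched to the reference that minimizes its MSE.

```python
import numpy as np

def matching_mse_sketch(predictions: np.ndarray, references: np.ndarray) -> float:
    """Toy illustration of a matching-based MSE (not the metric's actual code)."""
    # predictions: (num_generation, seq_len, num_features)
    # references:  (num_reference,  seq_len, num_features)
    diff = predictions[:, None] - references[None, :]  # (G, R, seq_len, num_features)
    pairwise_mse = (diff ** 2).mean(axis=(2, 3))       # MSE for every (generated, reference) pair
    # Score each generated series by its best-matching reference, then average.
    return float(pairwise_mse.min(axis=1).mean())

rng = np.random.default_rng(0)
refs = rng.random((10, 100, 10))
preds = rng.random((100, 100, 10))
print(matching_mse_sketch(preds, refs))
```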
## Limitations and Bias

Because the scores are plain MSE values, the metric is sensitive to the scale of the input series; predictions and references should be on a comparable scale (for example, normalized the same way). The matching step also compares every generated series against every reference series, so its cost grows with the product `num_generation * num_reference`.

## Citation

*Cite the source where this metric was introduced.*

## Further References

*Add any useful further references.*