---
title: matching_series
tags:
- evaluate
- metric
description: "Matching-based time-series generation metric"
sdk: gradio
sdk_version: 3.50
app_file: app.py
pinned: false
---

# Metric Card for matching_series

## Metric Description

Matching Series is a metric for evaluating time-series generation models. It is based on the idea of matching the generated time-series with the original (reference) time-series: each series is paired with its closest counterpart on the other side, and the metric reports the Mean Squared Error (MSE) between the matched instances. The metric outputs a score greater than or equal to 0, where 0 indicates a perfect generation.

## How to Use

At minimum, the metric requires the original time-series (`references`) and the generated time-series (`predictions`) as input. It can be used to evaluate the performance of time-series generation models.

```python
>>> import numpy as np
>>> import evaluate
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000)
>>> print(results)
{'matching_mse': 0.15250070138019745, 'harmonic_mean': 0.15246672297315564, 'covered_mse': 0.15243275970407652, 'index_mse': 0.16772539808686357, 'matching_mse_features': [0.11976368411913872, 0.1238622735860897, 0.1235259257706047, 0.12385236248438022, 0.12241466736218365, 0.12328439290438079, 0.1232240061707885, 0.12342319803028035, 0.12235222572924524, 0.12437865819262514], 'harmonic_mean_features': [0.12010478503934609, 0.12379899085819131, 0.12321441761307182, 0.12273884163905005, 0.12256126537300535, 0.12323289686030311, 0.12323847434641247, 0.12333469339243568, 0.12273530480438972, 0.12390254295493403], 'covered_mse_features': [0.12044783449951382, 0.1237357727610885, 0.12290447662839017, 0.12164516506865233, 0.12270821492248948, 0.12318144381818667, 0.12325294591995689, 0.12324631559392285, 0.12312079021887229, 0.12343005890751833], 'index_mse_features': [0.16331894487549958, 0.1679797859239729, 0.16904075114728268, 0.16962427920551068, 0.16915910655024802, 0.16686197230602684, 0.17056311327206022, 0.1638796919248867, 0.16736730842643857, 0.16945902723670975], 'macro_matching_mse': 0.1230081394349717, 'macro_covered_mse': 0.12276730183385913, 'macro_harmonic_mean': 0.12288622128811397}
```

### Inputs

- **predictions** (list of list of list of float or numpy.ndarray): The generated time-series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references** (list of list of list of float or numpy.ndarray): The original time-series. The shape of the array should be `(num_reference, seq_len, num_features)`.

### Output Values

The metric returns a dictionary with the following keys. All scores are greater than or equal to 0, and lower is better; 0 indicates a perfect generation.

- **matching_mse**: MSE over the pairs obtained by matching each generated series to its closest reference series; it reflects the fidelity of the generations.
- **covered_mse**: MSE over the pairs obtained by matching each reference series to its closest generated series; it reflects how well the generations cover the references.
- **harmonic_mean**: the harmonic mean of `matching_mse` and `covered_mse`.
- **index_mse**: MSE between the generated and reference series aligned by index, without matching.
- **matching_mse_features**, **covered_mse_features**, **harmonic_mean_features**, **index_mse_features**: the same scores computed independently for each feature.
- **macro_matching_mse**, **macro_covered_mse**, **macro_harmonic_mean**: the per-feature scores averaged over features.

#### Values from Popular Papers

*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*

### Examples

The examples below are illustrative sketches; shapes and values are chosen arbitrarily.
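A sanity check, assuming the usage shown above: generations close to the references should score lower than unrelated ones.

```python
>>> import numpy as np
>>> import evaluate
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> references = np.random.rand(10, 50, 3)
>>> near = references + 0.01 * np.random.randn(10, 50, 3)  # near-perfect generations
>>> far = np.random.rand(10, 50, 3)                        # unrelated generations
>>> near_results = metric.compute(references=references, predictions=near, batch_size=100)
>>> far_results = metric.compute(references=references, predictions=far, batch_size=100)
>>> near_results["matching_mse"] < far_results["matching_mse"]  # lower is better
True
```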
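Plain nested Python lists are also accepted, per the Inputs section above. A small sketch with 3 generated series, 2 reference series, 3 time steps, and 1 feature:

```python
>>> import evaluate
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> references = [[[0.0], [0.5], [1.0]],
...               [[1.0], [0.5], [0.0]]]
>>> predictions = [[[0.1], [0.4], [0.9]],
...                [[0.9], [0.6], [0.1]],
...                [[0.5], [0.5], [0.5]]]
>>> results = metric.compute(references=references, predictions=predictions, batch_size=8)
>>> 0.0 <= results["matching_mse"]  # scores are non-negative
True
```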
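To build intuition for the scores, here is a minimal numpy sketch of the matching idea. It is not this metric's implementation (the actual matching, batching, and per-feature handling may differ); it only illustrates "MSE between matched instances" under the assumption that each generated series is matched to the reference that minimizes its MSE.

```python
import numpy as np

def matching_mse_sketch(predictions: np.ndarray, references: np.ndarray) -> float:
    """Toy illustration of a matching-based MSE (not the metric's actual code)."""
    # predictions: (num_generation, seq_len, num_features)
    # references:  (num_reference,  seq_len, num_features)
    diff = predictions[:, None] - references[None, :]  # (G, R, seq_len, num_features)
    pairwise_mse = (diff ** 2).mean(axis=(2, 3))       # MSE for every (generated, reference) pair
    # Score each generated series by its best-matching reference, then average.
    return float(pairwise_mse.min(axis=1).mean())

rng = np.random.default_rng(0)
refs = rng.random((10, 100, 10))
preds = rng.random((100, 100, 10))
print(matching_mse_sketch(preds, refs))
```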
## Limitations and Bias

Because the scores are plain MSE values, the metric is sensitive to the scale of the input series; predictions and references should be on a comparable scale (for example, normalized the same way). The matching step also compares every generated series against every reference series, so its cost grows with the product `num_generation * num_reference`.

## Citation

*Cite the source where this metric was introduced.*

## Further References

*Add any useful further references.*