---
title: RougeRaw
emoji: 🤗
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  ROUGE RAW is a language-agnostic variant of ROUGE without a stemmer, stop
  words, and synonyms. This is a wrapper around the original
  http://hdl.handle.net/11234/1-2615 script.
---

# Metric Card for RougeRaw

## Metric Description

ROUGE RAW is a language-agnostic variant of ROUGE without a stemmer, stop words, or synonyms.
This is a wrapper around the original http://hdl.handle.net/11234/1-2615 script.

## How to Use

```python
import evaluate

rougeraw = evaluate.load('CZLC/rouge_raw')
predictions = ["the cat is on the mat", "hello there"]
references = ["the cat is on the mat", "hello there"]
results = rougeraw.compute(predictions=predictions, references=references)
print(results)
# {'1_low_precision': 1.0, '1_low_recall': 1.0, '1_low_fmeasure': 1.0, '1_mid_precision': 1.0, '1_mid_recall': 1.0, '1_mid_fmeasure': 1.0, '1_high_precision': 1.0, '1_high_recall': 1.0, '1_high_fmeasure': 1.0, '2_low_precision': 1.0, '2_low_recall': 1.0, '2_low_fmeasure': 1.0, '2_mid_precision': 1.0, '2_mid_recall': 1.0, '2_mid_fmeasure': 1.0, '2_high_precision': 1.0, '2_high_recall': 1.0, '2_high_fmeasure': 1.0, 'L_low_precision': 1.0, 'L_low_recall': 1.0, 'L_low_fmeasure': 1.0, 'L_mid_precision': 1.0, 'L_mid_recall': 1.0, 'L_mid_fmeasure': 1.0, 'L_high_precision': 1.0, 'L_high_recall': 1.0, 'L_high_fmeasure': 1.0}
```

### Inputs

- **predictions** (`list` of `str`): predictions to evaluate. Each prediction should be a string with tokens separated by spaces.
- **references** (`list` of `str`): one reference per prediction. Each reference should be a string with tokens separated by spaces.

### Output Values

This metric outputs a dictionary containing the scores.

There are precision, recall, and F1 values for rougeraw-1, rougeraw-2, and rougeraw-L. By default, bootstrapped confidence intervals are calculated, meaning that for each metric there are low, mid, and high values specifying the confidence interval; the sketch below the key format shows how to read individual scores.

Key format:
```
{1|2|L}_{low|mid|high}_{precision|recall|fmeasure}
e.g.: 1_low_precision
```
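
For example, a minimal sketch of reading the point estimates out of the aggregated result, assuming `results` is the dictionary returned by `compute` in the example above:

```python
# Read the mid F-measure for each RougeRaw variant from the
# aggregated result dictionary described by the key format above.
for variant in ("1", "2", "L"):
    print(f"RougeRaw-{variant} F1: {results[f'{variant}_mid_fmeasure']:.3f}")
```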

If `aggregate` is set to `False`, the low/mid/high bounds are omitted and the key format is:
```
{1|2|L}_{precision|recall|fmeasure}
e.g.: 1_precision
```
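
A minimal sketch of non-aggregated scoring, assuming `aggregate` is a keyword argument of `compute` (the parameter named above) and reusing `predictions` and `references` from the How to Use example; the exact shape of the returned values is an assumption:

```python
# Hypothetical call with aggregate=False: the low/mid/high bounds are
# dropped and keys follow {1|2|L}_{precision|recall|fmeasure}.
results = rougeraw.compute(
    predictions=predictions,
    references=references,
    aggregate=False,
)
print(results["1_fmeasure"])  # per the non-aggregated key format above
```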

## Citation(s)

```bibtex
@inproceedings{straka-etal-2018-sumeczech,
    title = "{S}ume{C}zech: Large {C}zech News-Based Summarization Dataset",
    author = "Straka, Milan and
      Mediankin, Nikita and
      Kocmi, Tom and
      {\v{Z}}abokrtsk{\'y}, Zden{\v{e}}k and
      Hude{\v{c}}ek, Vojt{\v{e}}ch and
      Haji{\v{c}}, Jan",
    editor = "Calzolari, Nicoletta and
      Choukri, Khalid and
      Cieri, Christopher and
      Declerck, Thierry and
      Goggi, Sara and
      Hasida, Koiti and
      Isahara, Hitoshi and
      Maegaard, Bente and
      Mariani, Joseph and
      Mazo, H{\'e}l{\`e}ne and
      Moreno, Asuncion and
      Odijk, Jan and
      Piperidis, Stelios and
      Tokunaga, Takenobu",
    booktitle = "Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC} 2018)",
    month = may,
    year = "2018",
    address = "Miyazaki, Japan",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://aclanthology.org/L18-1551",
}
```