|
---
title: action_generation
datasets:
- none
tags:
- evaluate
- metric
description: "Evaluates action generation outputs of the form /class/phrase by scoring the class and phrase components separately and combining them with a weighted sum."
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
---
|
|
|
# Metric Card for action_generation |
|
|
|
## Metric Description |
|
Evaluates the output of an action generation task. Each generated action follows the format `/class/phrase`. The metric scores the `/class` and `phrase` components separately, then combines the two with a weighted sum.
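
The weights of that sum are not documented on this card, but the sample output under *How to Use* is consistent with a class weight of 0.8 and a phrase weight of 0.2. A minimal sketch of the combination step under that assumption (not the module's actual source):

```python
# Weights are inferred from the sample output below (class ≈ 0.8,
# phrase ≈ 0.2); they are an assumption, not taken from the module.
def combine_scores(class_scores: dict, phrase_scores: dict,
                   class_weight: float = 0.8, phrase_weight: float = 0.2) -> dict:
    return {
        key: round(class_weight * class_scores[key] + phrase_weight * phrase_scores[key], 4)
        for key in ("precision", "recall", "f1")
    }

# Reproduces the weighted_sum entry of the sample output:
print(combine_scores(
    {"precision": 0.7143, "recall": 0.8333, "f1": 0.7692},
    {"precision": 0.8571, "recall": 1.0, "f1": 0.9231},
))
# {'precision': 0.7429, 'recall': 0.8666, 'f1': 0.8}
```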
|
|
|
## How to Use |
|
```python
import evaluate

# Valid /class labels used when scoring the class component.
valid_labels = [
    "/開箱",
    "/教學",
    "/表達",
    "/分享/外部資訊",
    "/分享/個人資訊",
    "/推薦/產品",
    "/推薦/服務",
    "/推薦/其他",
    "",
]

# Each inner list holds the generated actions for one example,
# in "/class/phrase" format.
predictions = [
    ["/開箱/xxx", "/教學/yyy", "/表達/zzz"],
    ["/分享/外部資訊/aaa", "/教學/yyy", "/表達/zzz", "/分享/個人資訊/bbb"],
]
references = [
    ["/開箱/xxx", "/教學/yyy", "/表達/zzz"],
    ["/推薦/產品/bbb", "/教學/yyy", "/表達/zzz"],
]

metric = evaluate.load("DarrenChensformer/action_generation")
result = metric.compute(
    predictions=predictions,
    references=references,
    valid_labels=valid_labels,
    detailed_scores=True,
)
print(result)
```
|
|
|
```
{'class': {'precision': 0.7143, 'recall': 0.8333, 'f1': 0.7692},
 'phrase': {'precision': 0.8571, 'recall': 1.0, 'f1': 0.9231},
 'weighted_sum': {'precision': 0.7429, 'recall': 0.8666, 'f1': 0.8}}
```
|
|
|
### Inputs |
|
- **predictions** *(list of list of str)*: generated actions for each example, each in the format `/class/phrase`.
- **references** *(list of list of str)*: reference actions for each example, in the same format.
- **valid_labels** *(list of str)*: the `/class` labels recognized when scoring the class component.
- **detailed_scores** *(bool)*: if `True`, per-component `class` and `phrase` scores are returned alongside `weighted_sum`.
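
Because class labels can themselves contain slashes (e.g. `/分享/外部資訊`), splitting an action into its class and phrase parts presumably requires matching against `valid_labels` rather than cutting at the first slash. A minimal sketch of one way to do this with longest-prefix matching (an assumption, not the module's actual code):

```python
def split_action(action: str, valid_labels: list[str]) -> tuple[str, str]:
    """Split '/class/phrase' into (class, phrase) by longest matching label."""
    # Check longer labels first so '/分享/外部資訊' wins over a shorter '/分享'.
    for label in sorted(valid_labels, key=len, reverse=True):
        if label and action.startswith(label + "/"):
            return label, action[len(label) + 1:]
    return "", action  # no recognized class prefix

print(split_action("/分享/外部資訊/aaa", ["/分享/外部資訊", "/分享"]))
# ('/分享/外部資訊', 'aaa')
```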
|
|
|
### Output Values |
|
|
|
The metric returns a dictionary of precision, recall, and F1 scores. With `detailed_scores=True`, the dictionary contains three entries, each mapping to `{'precision': ..., 'recall': ..., 'f1': ...}`: `class` (scores on the `/class` component), `phrase` (scores on the `phrase` component), and `weighted_sum` (their weighted combination), as shown in the example above.

All scores range from 0 to 1, inclusive, and higher is better; identical predictions and references should score 1.0 throughout.
|
|
|
|
|
### Examples |
|
The *How to Use* section above shows a typical call in which the predictions only partially match the references.
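
As a sanity check, passing identical predictions and references should produce perfect scores for every component. A short sketch of that call (the expected values follow from the precision/recall definitions, not from a recorded run):

```python
import evaluate

metric = evaluate.load("DarrenChensformer/action_generation")

# Predictions identical to references: every score should be 1.0.
actions = [["/開箱/xxx", "/教學/yyy"]]
result = metric.compute(
    predictions=actions,
    references=actions,
    valid_labels=["/開箱", "/教學"],
    detailed_scores=True,
)
print(result)
# Expected: {'precision': 1.0, 'recall': 1.0, 'f1': 1.0} for 'class',
# 'phrase', and 'weighted_sum' alike.
```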
|
|
|
## Limitations and Bias |
|
The metric assumes every action string follows the `/class/phrase` format and that the full set of expected class labels is supplied via `valid_labels`; predictions whose class prefix is not listed may not be scored as intended. The weights used for the weighted sum are not documented on this card.
|
|
|
## Citation |
|
*Cite the source where this metric was introduced.* |
|
|
|
## Further References |
|
- [Hugging Face `evaluate` documentation](https://huggingface.co/docs/evaluate)
|
|