	---
	language:
	- en
	license: mit
	tags:
	- summarization
	- t5-large-summarization
	- pipeline:summarization
	thumbnail: https://huggingface.co/front/thumbnails/facebook.png
	model-index:
	- name: sysresearch101/t5-large-finetuned-xsum-cnn
	  results:
	  - task:
	      type: summarization
	      name: Summarization
	    dataset:
	      name: xsum
	      type: xsum
	      config: default
	      split: test
	    metrics:
	    - type: rouge
	      value: 36.7656
	      name: ROUGE-1
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2QzMDg4NTM0ZTc5MjAzNTY4MmY1YTRiMWI3M2I2NDdjMTM4ZGNhYzZhOWQzMWI0MjJlYmU3MTg0ZjVjMTEyZSIsInZlcnNpb24iOjF9.AuKHql0LQs0zDQNn7zvySnX50GAC8jEWyYz-LtBgWj0dcad86J8yfHbIDswmgx2ur0S3yttw72qNExag_Fw7Dw
	    - type: rouge
	      value: 14.6898
	      name: ROUGE-2
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTE3ZTExY2M3MTIwMWY0ODRkZDI1YjU2ZjRkOGJjOGQyYjcxMTMxOWExN2Q0OGNkZmNiYzYzYzVhODY4YzEwOSIsInZlcnNpb24iOjF9.F1Q17sa8IAsW8ouQ2VDLq_VvHDxjuMjVU3rMfvkbmKxAjTDKVTiaG6Eg9uSKIYzgJoDSsxhsZcjH-J0gGQv3Dg
	    - type: rouge
	      value: 30.0646
	      name: ROUGE-L
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzI1NjE0NmI5Nzc3ODFiNDI5YzVhNjUzNzU1NzA0ZDMwMjFjZDE1YzUxNjZmZTAwZTM0MmVmN2ZkYWUwMjBiZSIsInZlcnNpb24iOjF9.xehN8zOV6050WvoLZIJ-l2zB93jWY_ugcydDDqV06XwdKwZ7l0TI8BoLDOO7Mw7dRmHOWLNruDJZnOnW3_3pCQ
	    - type: rouge
	      value: 30.0563
	      name: ROUGE-LSUM
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmU0OTVhYTY0ZDJmOTU3OWE5MzgxYzdhNmQ3MjM3YzM2MGIzOGViY2ZkMTI1ZWI4NDMwOTlkODBjOGE4NTE4ZCIsInZlcnNpb24iOjF9.FtNN06HKSgEB1tiWpToEVnNfzhQs9ZR59386YynOY6T6oKWxbIiRyItzYXobNw96lg5c2sE4vdJSfdtbBpkyDA
	    - type: loss
	      value: 1.6373405456542969
	      name: loss
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTVjYzI0MmMyY2IzYTE0NDUxY2FiMDM4Mjk2NTI1NTk0NjFiYTY2OWMxODRjNWJhYjU4ZWU5OTk4Y2E5N2RkOSIsInZlcnNpb24iOjF9.Cz5AQ-B8IAXmf1Xc_7UJ0pI9XKYHxDEwmoP3ZFsS2Wmbk1pUB8o_Y8AErBR8-Q60qR_ndw8eSwrI0EnPohYHCw
	    - type: gen_len
	      value: 18.6054
	      name: gen_len
	      verified: true
	      verifyToken: >-
	        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWRlMjM5MzAyMjEzYzdkODFmNDk4NDg5NWM4NWIxMTU4YWMxNzZjMGFjOWJiMDdkMjQyMTY0ZGFmYzA2OTA0YiIsInZlcnNpb24iOjF9.IFiGJEsyD7Uhj8bo9SsAgibk9qCXZH6IWaLKULLxBz5N8WXF2vc2Mfg5OThEzdrydPhJInRgp0jd8m-kF5nNCA
	datasets:
	- abisee/cnn_dailymail
	- EdinburghNLP/xsum
	base_model:
	- google-t5/t5-large
	---

	# T5-Large Fine-tuned on the combined XSum + CNN/DailyMail Datasets

	- Task: Abstractive Summarization (English)
	- Base Model: google-t5/t5-large
	- License: MIT

	## Overview

	This model is a T5-Large checkpoint fine-tuned jointly on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) and [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) datasets. It produces concise, abstractive summaries of English news text and has been used in published summarization research (see Papers Using This Model).

	## Performance on the XSum Test Set

	| Metric | Score |
	|--------|-------|
	| ROUGE-1 | 36.77 |
	| ROUGE-2 | 14.69 |
	| ROUGE-L | 30.06 |
	| Loss | 1.64 |
	| Avg. Length | 18.6 tokens |
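
The ROUGE scores above are reported on a 0-100 scale. As a rough illustration of what ROUGE-1 and ROUGE-2 measure, the sketch below computes a simplified ROUGE-N F1 from clipped n-gram overlap. It assumes lowercase whitespace tokenization with no stemming, so it is not the scorer behind the reported numbers and will not reproduce them exactly; `ngrams` and `rouge_n_f1` are illustrative helper names, not part of any library.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1: lowercase, whitespace tokens, no stemming."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(100 * rouge_n_f1(reference, candidate, n=1), 2))  # 83.33
print(round(100 * rouge_n_f1(reference, candidate, n=2), 2))  # 60.0
```

For comparable numbers on real data, an established ROUGE implementation with stemming and proper tokenization is the safer choice.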



	## Usage

	### Quick Start

	```python
	from transformers import pipeline

	summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum-cnn")

	article = "Your article text here..."
	summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
	print(summary[0]['summary_text'])
	```

	### Advanced Usage

	```python
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	model_id = "sysresearch101/t5-large-finetuned-xsum-cnn"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

	article = "Your article text here..."

	# T5 expects a task prefix; inputs longer than 512 tokens are truncated.
	inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)
	outputs = model.generate(
	    **inputs,
	    max_length=80,
	    min_length=20,
	    num_beams=4,
	    no_repeat_ngram_size=2,
	    length_penalty=1.0,
	    repetition_penalty=2.5,
	    use_cache=True,
	    early_stopping=True,
	    # do_sample=True with num_beams > 1 runs beam-search sampling, so the
	    # output varies between runs. Set do_sample=False for deterministic
	    # beam search; temperature, top_k and top_p then have no effect.
	    do_sample=True,
	    temperature=0.8,
	    top_k=50,
	    top_p=0.95,
	)

	summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(summary)
	```
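
Because the tokenizer call above truncates at 512 tokens, anything beyond that point in a long article is ignored. One possible workaround, sketched below, is to split the text into chunks, summarize each chunk, and join the partial summaries. This is an illustration under stated assumptions rather than a documented feature of this model: `chunk_words`, `summarize_long`, and `summarize_fn` are hypothetical names, and the 350-word budget is an arbitrary stand-in for the 512-token limit, not a measured value.

```python
def chunk_words(text, max_words=350):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text, summarize_fn, max_words=350):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    return " ".join(summarize_fn(chunk) for chunk in chunk_words(text, max_words))


# Stand-in summarizer for illustration only: keeps the first five words of a chunk.
def first_five_words(chunk):
    return " ".join(chunk.split()[:5])


long_text = " ".join(f"w{i}" for i in range(800))
print(len(chunk_words(long_text)))  # 800 words at 350 per chunk -> 3 chunks
print(summarize_long(long_text, first_five_words))
```

With the `summarizer` pipeline from Quick Start, `summarize_fn` could be `lambda chunk: summarizer(chunk, max_length=80, min_length=20, do_sample=False)[0]["summary_text"]`. Joined chunk summaries can be repetitive or lose cross-chunk context, so the result should be checked by hand.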

	## Training Data

	- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum): BBC articles with single-sentence summaries
	- [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail): News articles with multi-sentence summaries
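
The two datasets use different field names (`document`/`summary` in XSum, `article`/`highlights` in CNN/DailyMail), so combining them requires mapping both to a common shape. This card does not describe the preprocessing that was actually used; the sketch below shows one plausible way to do it, assuming the `summarize: ` prefix from the Advanced Usage example and a seeded shuffle. `to_pair` and `combine` are hypothetical helper names.

```python
import random

PREFIX = "summarize: "  # T5 task prefix, as in the Advanced Usage example


def to_pair(record):
    """Map an XSum or CNN/DailyMail record to a (source, target) pair."""
    if "document" in record:    # XSum fields: document, summary
        source, target = record["document"], record["summary"]
    elif "article" in record:   # CNN/DailyMail fields: article, highlights
        source, target = record["article"], record["highlights"]
    else:
        raise KeyError("record matches neither the XSum nor the CNN/DailyMail schema")
    return PREFIX + source.strip(), target.strip()


def combine(xsum_records, cnn_records, seed=0):
    """Concatenate both datasets and shuffle them with a fixed seed."""
    pairs = [to_pair(r) for r in list(xsum_records) + list(cnn_records)]
    random.Random(seed).shuffle(pairs)
    return pairs


# Tiny made-up records for illustration; real data would come from the datasets linked above.
xsum = [{"document": "A short BBC-style article.", "summary": "One sentence."}]
cnn = [{"article": "A longer news article.", "highlights": "First point.\nSecond point."}]
for source, target in combine(xsum, cnn):
    print(source, "->", target)
```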

	## Intended Use

	- Primary: Summarization
	- Secondary: Educational demonstrations, reproducible baselines, research benchmarking, and academic studies on summarization


	## Limitations
	- Optimized for English news text; performance may vary on other domains
	- Tends to produce very concise summaries (about 18.6 tokens on average on the XSum test set)
	- No built-in fact-checking or content filtering


	## Citation

	```bibtex
	@misc{stept2023_t5_large_xsum_cnn_summarization,
	author = {Shlomo Stept (sysresearch101)},
	title = {T5-Large Fine-tuned on XSum + CNN/DailyMail for Abstractive Summarization},
	year = {2023},
	publisher = {Hugging Face},
	url = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum-cnn}
	}
	```


	## Papers Using This Model
	* [Zhu et al. (2023). Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization. ACL 2023 (Long).](https://aclanthology.org/2023.acl-long.377.pdf)
	* [European Food Safety Authority (2023). Implementing AI Vertical use cases – Scenario 1. EFSA Journal, Special Publication EN-8223.](https://doi.org/10.2903/sp.efsa.2023.EN-8223)
	* (Forthcoming) Budget-Constrained Learning to Defer for Autoregressive Generation (under review, ICLR 2025)



	## Contact

	Created by [Shlomo Stept](https://shlomostept.com) ([ORCID: 0009-0009-3185-589X](https://orcid.org/0009-0009-3185-589X))
	DARMIS AI

	- Website: [shlomostept.com](https://shlomostept.com)
	- LinkedIn: [linkedin.com/in/shlomo-stept](https://linkedin.com/in/shlomo-stept)