---
language:
- en
license: mit
tags:
- summarization
- t5-large-summarization
- pipeline:summarization
thumbnail: https://huggingface.co/front/thumbnails/facebook.png
model-index:
- name: sysresearch101/t5-large-finetuned-xsum-cnn
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: xsum
      type: xsum
      config: default
      split: test
    metrics:
    - type: rouge
      value: 36.7656
      name: ROUGE-1
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2QzMDg4NTM0ZTc5MjAzNTY4MmY1YTRiMWI3M2I2NDdjMTM4ZGNhYzZhOWQzMWI0MjJlYmU3MTg0ZjVjMTEyZSIsInZlcnNpb24iOjF9.AuKHql0LQs0zDQNn7zvySnX50GAC8jEWyYz-LtBgWj0dcad86J8yfHbIDswmgx2ur0S3yttw72qNExag_Fw7Dw
    - type: rouge
      value: 14.6898
      name: ROUGE-2
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTE3ZTExY2M3MTIwMWY0ODRkZDI1YjU2ZjRkOGJjOGQyYjcxMTMxOWExN2Q0OGNkZmNiYzYzYzVhODY4YzEwOSIsInZlcnNpb24iOjF9.F1Q17sa8IAsW8ouQ2VDLq_VvHDxjuMjVU3rMfvkbmKxAjTDKVTiaG6Eg9uSKIYzgJoDSsxhsZcjH-J0gGQv3Dg
    - type: rouge
      value: 30.0646
      name: ROUGE-L
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzI1NjE0NmI5Nzc3ODFiNDI5YzVhNjUzNzU1NzA0ZDMwMjFjZDE1YzUxNjZmZTAwZTM0MmVmN2ZkYWUwMjBiZSIsInZlcnNpb24iOjF9.xehN8zOV6050WvoLZIJ-l2zB93jWY_ugcydDDqV06XwdKwZ7l0TI8BoLDOO7Mw7dRmHOWLNruDJZnOnW3_3pCQ
    - type: rouge
      value: 30.0563
      name: ROUGE-LSUM
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmU0OTVhYTY0ZDJmOTU3OWE5MzgxYzdhNmQ3MjM3YzM2MGIzOGViY2ZkMTI1ZWI4NDMwOTlkODBjOGE4NTE4ZCIsInZlcnNpb24iOjF9.FtNN06HKSgEB1tiWpToEVnNfzhQs9ZR59386YynOY6T6oKWxbIiRyItzYXobNw96lg5c2sE4vdJSfdtbBpkyDA
    - type: loss
      value: 1.6373405456542969
      name: loss
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTVjYzI0MmMyY2IzYTE0NDUxY2FiMDM4Mjk2NTI1NTk0NjFiYTY2OWMxODRjNWJhYjU4ZWU5OTk4Y2E5N2RkOSIsInZlcnNpb24iOjF9.Cz5AQ-B8IAXmf1Xc_7UJ0pI9XKYHxDEwmoP3ZFsS2Wmbk1pUB8o_Y8AErBR8-Q60qR_ndw8eSwrI0EnPohYHCw
    - type: gen_len
      value: 18.6054
      name: gen_len
      verified: true
      verifyToken: >-
        eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWRlMjM5MzAyMjEzYzdkODFmNDk4NDg5NWM4NWIxMTU4YWMxNzZjMGFjOWJiMDdkMjQyMTY0ZGFmYzA2OTA0YiIsInZlcnNpb24iOjF9.IFiGJEsyD7Uhj8bo9SsAgibk9qCXZH6IWaLKULLxBz5N8WXF2vc2Mfg5OThEzdrydPhJInRgp0jd8m-kF5nNCA
datasets:
- abisee/cnn_dailymail
- EdinburghNLP/xsum
base_model:
- google-t5/t5-large
---
# T5-Large Fine-tuned on the combined XSum + CNN/DailyMail Datasets
**Task:** Abstractive Summarization (English)
**Base Model:** google-t5/t5-large
**License:** MIT
## Overview
This model is a T5-Large checkpoint fine-tuned jointly on the [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum) and [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) datasets. It produces concise, abstractive summaries and has been used as a baseline in published summarization research (see Papers Using This Model).
## Performance on the XSum Test Set
| Metric | Score |
|--------|-------|
| ROUGE-1 | 36.77 |
| ROUGE-2 | 14.69 |
| ROUGE-L | 30.06 |
| ROUGE-LSUM | 30.06 |
| Loss | 1.64 |
| Avg. Length | 18.6 tokens |
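For readers unfamiliar with the metric, the sketch below shows what a ROUGE-N F1 score measures: clipped n-gram overlap between a reference summary and a generated one. It is a simplified illustration, assuming lowercased whitespace tokenization and no stemming; the helper names `ngrams` and `rouge_n_f1` are hypothetical, and the scores in the table come from a full ROUGE implementation, so this code will not reproduce them exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on lowercased whitespace tokens, scaled to 0-100.

    Assumption: no stemming or punctuation handling, unlike standard tooling.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n_f1(reference, candidate, n=1), 2))  # unigram overlap
print(round(rouge_n_f1(reference, candidate, n=2), 2))  # bigram overlap
```

ROUGE-2 is lower than ROUGE-1 for the same pair because a single substituted word breaks two bigrams, which is consistent with the gap between the two rows in the table.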
## Usage
### Quick Start
```python
from transformers import pipeline
summarizer = pipeline("summarization", model="sysresearch101/t5-large-finetuned-xsum-cnn")
article = "Your article text here..."
summary = summarizer(article, max_length=80, min_length=20, do_sample=False)
print(summary[0]['summary_text'])
```
### Advanced Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "sysresearch101/t5-large-finetuned-xsum-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "Your article text here..."

# T5 expects a task prefix; inputs longer than 512 tokens are truncated.
inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)

# Beam search combined with sampling, so the output can differ between runs.
outputs = model.generate(
    **inputs,
    max_length=80,
    min_length=20,
    num_beams=4,
    no_repeat_ngram_size=2,
    length_penalty=1.0,
    repetition_penalty=2.5,
    use_cache=True,
    early_stopping=True,
    do_sample=True,
    temperature=0.8,
    top_k=50,
    top_p=0.95,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
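The generation call above combines beam search (`num_beams=4`) with sampling (`do_sample=True`), so repeated runs can return different summaries. For reproducible baselines, one option is to switch sampling off, in which case `temperature`, `top_k`, and `top_p` no longer apply. The sketch below wraps both variants in a hypothetical helper, `build_generation_kwargs`; the deterministic branch is an assumption about what a reproducible setup might want, not a setting recommended by the model author.

```python
def build_generation_kwargs(deterministic=True):
    """Return keyword arguments for model.generate().

    Hypothetical helper: the shared values mirror the Advanced Usage example.
    The deterministic branch (plain beam search) is an illustrative assumption.
    """
    kwargs = {
        "max_length": 80,
        "min_length": 20,
        "num_beams": 4,
        "no_repeat_ngram_size": 2,
        "length_penalty": 1.0,
        "repetition_penalty": 2.5,
        "early_stopping": True,
    }
    if deterministic:
        # Plain beam search: same input gives the same summary on every run.
        kwargs["do_sample"] = False
    else:
        # Sampling settings taken from the Advanced Usage example.
        kwargs.update(do_sample=True, temperature=0.8, top_k=50, top_p=0.95)
    return kwargs


print(build_generation_kwargs(deterministic=True))
```

With the tokenizer and model loaded as in the Advanced Usage example, the result can be passed as `model.generate(**inputs, **build_generation_kwargs())`.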
## Training Data
- [XSum](https://huggingface.co/datasets/EdinburghNLP/xsum): BBC articles with single-sentence summaries
- [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail): News articles with multi-sentence summaries
## Intended Use
- **Primary:** Summarization.
- **Secondary:** Educational demonstrations, reproducible baselines, research benchmarking, and academic studies on summarization
## Limitations
- Optimized for English news text; performance may vary on other domains
- Tends to produce very concise summaries (18-20 tokens on average)
- No built-in fact-checking or content filtering
## Citation
```bibtex
@misc{stept2023_t5_large_xsum_cnn_summarization,
author = {Shlomo Stept (sysresearch101)},
title = {T5-Large Fine-tuned on XSum + CNN/DailyMail for Abstractive Summarization},
year = {2023},
publisher = {Hugging Face},
url = {https://huggingface.co/sysresearch101/t5-large-finetuned-xsum-cnn}
}
```
## Papers Using This Model
* [Zhu et al. (2023). *Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization.* ACL 2023 (Long).](https://aclanthology.org/2023.acl-long.377.pdf)
* [European Food Safety Authority (2023). *Implementing AI Vertical use cases – Scenario 1.* EFSA Journal, Special Publication EN-8223.](https://doi.org/10.2903/sp.efsa.2023.EN-8223)
* *(Forthcoming)* Budget-Constrained Learning to Defer for Autoregressive Generation (under review, ICLR 2025)
## Contact
Created by [Shlomo Stept](https://shlomostept.com) ([ORCID: 0009-0009-3185-589X](https://orcid.org/0009-0009-3185-589X))
DARMIS AI
- Website: [shlomostept.com](https://shlomostept.com)
- LinkedIn: [linkedin.com/in/shlomo-stept](https://linkedin.com/in/shlomo-stept)