|
---
tags:
- generated_from_trainer
- distilbart
model-index:
- name: distilbart-finetuned-summarization
  results: []
license: apache-2.0
datasets:
- cnn_dailymail
- xsum
- samsum
- ccdv/pubmed-summarization
language:
- en
metrics:
- rouge
---
|
|
|
|
|
|
# distilbart-finetuned-summarization
|
|
|
This model is a further fine-tuned version of [distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6), trained on the combination of four summarization datasets:
|
- [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) |
|
- [samsum](https://huggingface.co/datasets/samsum) |
|
- [xsum](https://huggingface.co/datasets/xsum) |
|
- [ccdv/pubmed-summarization](https://huggingface.co/datasets/ccdv/pubmed-summarization) |
|
|
|
Please check out the official model page and paper:
|
- [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6) |
|
- [Pre-trained Summarization Distillation](https://arxiv.org/abs/2010.13002) |
|
|
|
## Training and evaluation data |
|
|
|
One can reproduce the combined dataset with the following code:
|
|
|
```python
from datasets import DatasetDict, concatenate_datasets, load_dataset

# Load the four datasets, renaming columns so that every dataset
# exposes the same "document" and "summary" fields.
xsum_dataset = load_dataset("xsum")
pubmed_dataset = load_dataset("ccdv/pubmed-summarization").rename_column("article", "document").rename_column("abstract", "summary")
cnn_dataset = load_dataset("cnn_dailymail", "3.0.0").rename_column("article", "document").rename_column("highlights", "summary")
samsum_dataset = load_dataset("samsum").rename_column("dialogue", "document")

# Concatenate the matching splits of all four datasets.
summary_train = concatenate_datasets([xsum_dataset["train"], pubmed_dataset["train"], cnn_dataset["train"], samsum_dataset["train"]])
summary_validation = concatenate_datasets([xsum_dataset["validation"], pubmed_dataset["validation"], cnn_dataset["validation"], samsum_dataset["validation"]])
summary_test = concatenate_datasets([xsum_dataset["test"], pubmed_dataset["test"], cnn_dataset["test"], samsum_dataset["test"]])

raw_datasets = DatasetDict(
    {
        "train": summary_train,
        "validation": summary_validation,
        "test": summary_test,
    }
)
```
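
Before training, the combined `document`/`summary` pairs are tokenized. The notebook linked below contains the exact preprocessing; the snippet here is only a minimal sketch, and the length caps of 1024 input tokens and 128 target tokens are illustrative assumptions, not values taken from the notebook:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")

def preprocess(examples):
    # The 1024/128 length caps below are illustrative, not notebook values.
    model_inputs = tokenizer(examples["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=examples["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized_datasets = raw_datasets.map(
    preprocess, batched=True, remove_columns=raw_datasets["train"].column_names
)
```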
|
|
|
## Inference example |
|
|
|
```python
from transformers import pipeline

pipe = pipeline("text2text-generation", model="lxyuan/distilbart-finetuned-summarization")

text = """The tower is 324 metres (1,063 ft) tall, about the same height as
an 81-storey building, and the tallest structure in Paris. Its base is square,
measuring 125 metres (410 ft) on each side. During its construction, the
Eiffel Tower surpassed the Washington Monument to become the tallest man-made
structure in the world, a title it held for 41 years until the Chrysler Building
in New York City was finished in 1930. It was the first structure to reach a
height of 300 metres. Due to the addition of a broadcasting aerial at the top
of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres
(17 ft). Excluding transmitters, the Eiffel Tower is the second tallest
free-standing structure in France after the Millau Viaduct.
"""

pipe(text)

# Output:
# The Eiffel Tower is the tallest man-made structure in the world .
# The tower is 324 metres tall, about the same height as an 81-storey building .
# Due to the addition of a broadcasting aerial in 1957, it is now taller than
# the Chrysler Building by 5.2 metres .
```
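
Generation parameters can be passed directly through the pipeline call. The values below are illustrative, not settings taken from the notebook:

```python
# Constrain summary length and use greedy decoding (illustrative values).
pipe(text, max_length=128, min_length=30, do_sample=False)
```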
|
|
|
## Training procedure |
|
|
|
Notebook link: [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/distilbart-finetune-summarisation.ipynb) |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a sketch assembling them into `Seq2SeqTrainingArguments` follows the list):
- evaluation_strategy: epoch
- save_strategy: epoch
- logging_strategy: epoch
- learning_rate: 2e-05
- per_device_train_batch_size: 2
- per_device_eval_batch_size: 2
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- weight_decay: 0.01
- save_total_limit: 2
- num_train_epochs: 10
- predict_with_generate: True
- fp16: True
- push_to_hub: True
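
As referenced above, here is a minimal sketch of how these values map onto `Seq2SeqTrainingArguments`; the `output_dir` is a placeholder, not taken from the notebook:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-finetuned-summarization",  # placeholder directory
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=64,
    weight_decay=0.01,
    save_total_limit=2,
    num_train_epochs=10,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
)
```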
|
|
|
### Training results |
|
_Training is still in progress_ |
|
|
|
| Epoch | Training Loss | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|-------|---------------|-----------------|---------|---------|---------|-----------|---------|
| 0     | 1.779700      | 1.719054        | 40.0039 | 17.9071 | 27.8825 | 34.8886   | 88.8936 |
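
ROUGE scores of this kind can be computed with the `evaluate` library; the snippet below is a minimal sketch with made-up example strings, not the notebook's exact metric function:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair purely for illustration.
predictions = ["The Eiffel Tower is the tallest structure in Paris."]
references = ["The Eiffel Tower, at 324 metres, is the tallest structure in Paris."]

# use_stemmer=True is a common setup when reporting summarization ROUGE.
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```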
|
|
|
### Framework versions |
|
|
|
- Transformers 4.30.2 |
|
- Pytorch 2.0.1+cu117 |
|
- Datasets 2.13.1 |
|
- Tokenizers 0.13.3 |