update model card README.md

README.md CHANGED
@@ -1,123 +1,48 @@
---
tags:
- generated_from_trainer
- distilbart
model-index:
- name: distilbart-finetuned-summarization
  results: []
license: apache-2.0
datasets:
- cnn_dailymail
- xsum
- samsum
- ccdv/pubmed-summarization
language:
- en
metrics:
- rouge
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbart-finetuned-summarization

This model is a fine-tuned version of [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6), trained for summarization on the following datasets:

- [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail)
- [samsum](https://huggingface.co/datasets/samsum)
- [xsum](https://huggingface.co/datasets/xsum)
- [ccdv/pubmed-summarization](https://huggingface.co/datasets/ccdv/pubmed-summarization)

For details on the base checkpoint and the distillation approach behind it, see:

- [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6)
- [Pre-trained Summarization Distillation](https://arxiv.org/abs/2010.13002)

The combined dataset can be reproduced with the following code:

```python
from datasets import DatasetDict, concatenate_datasets, load_dataset

# Load each corpus and normalize column names to "document"/"summary".
xsum_dataset = load_dataset("xsum")
pubmed_dataset = load_dataset("ccdv/pubmed-summarization").rename_column("article", "document").rename_column("abstract", "summary")
cnn_dataset = load_dataset("cnn_dailymail", "3.0.0").rename_column("article", "document").rename_column("highlights", "summary")
samsum_dataset = load_dataset("samsum").rename_column("dialogue", "document")

# Concatenate the corresponding splits from all four corpora.
summary_train = concatenate_datasets([xsum_dataset["train"], pubmed_dataset["train"], cnn_dataset["train"], samsum_dataset["train"]])
summary_validation = concatenate_datasets([xsum_dataset["validation"], pubmed_dataset["validation"], cnn_dataset["validation"], samsum_dataset["validation"]])
summary_test = concatenate_datasets([xsum_dataset["test"], pubmed_dataset["test"], cnn_dataset["test"], samsum_dataset["test"]])

raw_datasets = DatasetDict()
raw_datasets["train"] = summary_train
raw_datasets["validation"] = summary_validation
raw_datasets["test"] = summary_test
```
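
The card does not include the preprocessing step, but a minimal tokenization sketch might look like the following. The checkpoint name and maximum lengths here are assumptions for illustration, not values taken from this card:

```python
from transformers import AutoTokenizer

# Assumed base checkpoint and sequence lengths; the exact preprocessing settings
# live in the training notebook linked below, not in this card.
checkpoint = "sshleifer/distilbart-cnn-12-6"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def preprocess(batch):
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized_datasets = raw_datasets.map(preprocess, batched=True)
```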

## Inference example

```python
from transformers import pipeline

# The Hub model id below is assumed; replace it with this model's actual repository id.
pipe = pipeline("summarization", model="lxyuan/distilbart-finetuned-summarization")

text = """The tower is 324 metres (1,063 ft) tall, about the same height as
an 81-storey building, and the tallest structure in Paris. Its base is square,
measuring 125 metres (410 ft) on each side. During its construction, the
Eiffel Tower surpassed the Washington Monument to become the tallest man-made
structure in the world, a title it held for 41 years until the Chrysler Building
in New York City was finished in 1930. It was the first structure to reach a
height of 300 metres. Due to the addition of a broadcasting aerial at the top
of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres
(17 ft). Excluding transmitters, the Eiffel Tower is the second tallest
free-standing structure in France after the Millau Viaduct.
"""

pipe(text)

"""Output:
The tower is 324 metres tall, about the same height as an 81-storey building .
Due to the addition of a broadcasting aerial in 1957, it is now taller than
the Chrysler Building by 5.2 metres .
"""
```
## Training procedure

Notebook link: [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/distilbart-finetune-summarisation.ipynb)
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate=2e-05,
- per_device_train_batch_size=2,
- per_device_eval_batch_size=2,
- gradient_accumulation_steps=64,
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- save_total_limit=2,
- num_train_epochs=10,
- predict_with_generate=True,
- fp16=True,
- push_to_hub=True
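
Assembled as code, these settings correspond roughly to the `Seq2SeqTrainingArguments`/`Seq2SeqTrainer` setup sketched below. The output directory, base checkpoint, and data collator wiring are assumptions for illustration (the authoritative version is the linked notebook), and `tokenized_datasets` refers to the preprocessing sketch above:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "sshleifer/distilbart-cnn-12-6"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-finetuned-summarization",  # assumed output directory
    learning_rate=2e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=64,  # effective batch size: 2 * 64 = 128 on one device
    save_total_limit=2,
    num_train_epochs=10,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```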
### Training results

_Training is still in progress_

| Epoch | Training Loss | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|-------|---------------|-----------------|---------|---------|---------|-----------|---------|
| 0     | 1.779700      | 1.719054        | 40.0039 | 17.9071 | 27.8825 | 34.8886   | 88.8936 |
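
The ROUGE columns above are of the kind produced by the `evaluate` ROUGE metric; a minimal sketch of computing such scores (the example strings are illustrative only, not from the card) is:

```python
import evaluate

# Minimal ROUGE computation; the actual compute_metrics function used during
# training is defined in the linked notebook.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["The tower is 324 metres tall."],
    references=["The Eiffel Tower is 324 metres tall."],
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```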
### Framework versions

- Transformers 4.30.2
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3

---
tags:
- generated_from_trainer
model-index:
- name: distilbart-finetuned-summarization
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbart-finetuned-summarization

This model was trained from scratch on the None dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
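
The `total_train_batch_size` reported here follows from the other listed values; a quick check, assuming a single device:

```python
# total_train_batch_size = train_batch_size * gradient_accumulation_steps (single device assumed)
train_batch_size = 2
gradient_accumulation_steps = 64
print(train_batch_size * gradient_accumulation_steps)  # 128
```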
### Framework versions

- Transformers 4.30.2
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3