---
license: apache-2.0
tags:
- generated_from_trainer
- summarization
metrics:
- rouge
datasets:
- stacked-summaries/stacked-samsum-1024
model-index:
- name: flan-t5-large-stacked-samsum1024-WIP3
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 47.6682
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 23.3053
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 39.7678
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 43.259
      verified: true
    - name: loss
      type: loss
      value: 2.372586965560913
      verified: true
    - name: gen_len
      type: gen_len
      value: 17.4237
      verified: true
---
# flan-t5-large-stacked-samsum1024-WIP3
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the [stacked-summaries/stacked-samsum-1024](https://huggingface.co/datasets/stacked-summaries/stacked-samsum-1024) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.1311
- Rouge1: 58.1114
- Rouge2: 29.339
- Rougel: 44.7611
- Rougelsum: 54.2823
- Gen Len: 122.364
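The ROUGE values above are reported as percentages. As a minimal sketch of how such scores can be reproduced with the 🤗 `evaluate` library (the prediction/reference texts below are made up purely for illustration):

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair, for illustration only.
predictions = ["Bob confirms lunch with Alice at noon tomorrow."]
references = ["Alice and Bob agree to meet for lunch at noon tomorrow."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Recent versions of `evaluate` return plain floats in [0, 1]; scale to match the card.
print({k: round(v * 100, 4) for k, v in scores.items()})
```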
## Model description
More information needed
## Intended uses & limitations
- max input/output length is 1024 tokens
- this is mostly a test, since [samsum](https://huggingface.co/datasets/samsum) is not exactly the best dataset for general-purpose summarization; a minimal usage sketch follows this list
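A minimal sketch of loading the checkpoint with the standard `transformers` summarization pipeline. The repo ID below is a placeholder, not the confirmed Hub path of this model; substitute the actual owner/repo:

```python
from transformers import pipeline

# NOTE: placeholder repo ID; replace with the actual Hub path of this checkpoint.
summarizer = pipeline(
    "summarization",
    model="your-username/flan-t5-large-stacked-samsum1024-WIP3",
)

dialogue = (
    "Alice: Are we still on for lunch tomorrow?\n"
    "Bob: Yes! Noon at the usual place?\n"
    "Alice: Perfect, see you then."
)

# Both input and output are capped at 1024 tokens (see limitations above).
result = summarizer(dialogue, max_length=64, min_length=8, do_sample=False)
print(result[0]["summary_text"])
```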
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006
- train_batch_size: 4
- eval_batch_size: 2
- seed: 2760
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 32
- total_train_batch_size: 256
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 2.0
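For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments`. The `output_dir` is illustrative, and the two-GPU distributed launch (e.g. via `torchrun` or `accelerate`) is assumed rather than shown:

```python
from transformers import Seq2SeqTrainingArguments

# Effective train batch size: 4 per device x 2 GPUs x 32 accumulation steps = 256.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-stacked-samsum1024-WIP3",  # illustrative path
    learning_rate=6e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=32,
    seed=2760,
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=2.0,
)
```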
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.1734        | 1.0   | 115  | 1.8751          | 57.9286 | 29.2743 | 44.7181 | 54.2295   | 122.123 |
| 0.1098        | 2.0   | 230  | 2.1311          | 58.1114 | 29.339  | 44.7611 | 54.2823   | 122.364 |
### Framework versions
- Transformers 4.26.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1