# FLAN-T5-small Dialogue Summarization

## Model Description

A FLAN-T5-small model fine-tuned for dialogue summarization on the DialogSum dataset. It generates concise summaries of conversational dialogues and improves over the base model on ROUGE metrics (see Evaluation Results below).
## Training Data

- Dataset: DialogSum (1,837 annotated dialogues)
- Preprocessing: each example was converted into instruction format as a dialogue-summary pair, using the following prompt template:

```python
prompt_template = """Here is a dialogue:

{dialogue}

Write a short summary.

{summary}"""
```
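The conversion step can be sketched as a simple mapping function; the function name and sample record below are illustrative assumptions, since the card does not publish the preprocessing script:

```python
# Sketch of the instruction-format conversion; names are illustrative.
prompt_template = """Here is a dialogue:

{dialogue}

Write a short summary.

{summary}"""

def to_instruction_format(example: dict) -> dict:
    """Turn a DialogSum record into one instruction-style text field."""
    text = prompt_template.format(
        dialogue=example["dialogue"],
        summary=example["summary"],
    )
    return {"text": text}

# Hypothetical record in DialogSum's speaker-tag style
sample = {
    "dialogue": "#Person1#: Hi, how are you?\n#Person2#: Fine, thanks.",
    "summary": "Two people greet each other.",
}
print(to_instruction_format(sample)["text"])
```

A function of this shape can be applied over the whole dataset, e.g. with `datasets.Dataset.map`.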
## Training Setup

| Parameter | Value |
|---|---|
| Base Model | google/flan-t5-small |
| Epochs | 5 |
| Batch Size | 16 (per device) |
| Learning Rate | 3e-4 |
| Optimizer | Adafactor |
| Mixed Precision | fp16 |
| Gradient Accumulation | 4 steps |
| Max Length | 512 tokens |
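The hyperparameters above map naturally onto `transformers.Seq2SeqTrainingArguments` keyword arguments. The training script is not published, so this mapping (and the output directory name) is an assumption based on the table:

```python
# Hyperparameters from the table above, as keyword arguments one could pass
# to transformers.Seq2SeqTrainingArguments. The output_dir is illustrative.
training_kwargs = {
    "output_dir": "finetuned-flan-t5-dialogsum",
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "learning_rate": 3e-4,
    "optim": "adafactor",
    "fp16": True,
    "gradient_accumulation_steps": 4,
}
# Max Length (512) applies to tokenization, not to the trainer arguments.

# With gradient accumulation, the effective batch size is larger than the
# per-device batch size:
effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch)  # 64
```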
## Evaluation Results

| Metric | Value |
|---|---|
| ROUGE-1 | 0.3722 |
| ROUGE-2 | 0.1066 |
| ROUGE-L | 0.2794 |
## Basic Inference

```python
from transformers import pipeline
from datasets import load_dataset
from evaluate import load

# Fine-tuned model and the untuned base model for comparison
summarizer = pipeline('summarization', model='ingu627/finetuned-flan-t5-dialogsum')
basic_summarizer = pipeline('summarization', model='google/flan-t5-small')

dataset = load_dataset('knkarthick/dialogsum', split='test')
rouge = load('rouge')

references = []
predictions = []
basic_predictions = []

# Summarize the first 50 test dialogues with both models
for example in dataset.select(range(50)):
    prompt = f"Summarize this dialogue:\n{example['dialogue']}\nSummary:"
    generated = summarizer(prompt, max_length=135, num_beams=3)[0]['summary_text']
    basic_generated = basic_summarizer(prompt, max_length=135, num_beams=3)[0]['summary_text']
    references.append(example['summary'])
    predictions.append(generated)
    basic_predictions.append(basic_generated)

# Aggregate ROUGE scores for the fine-tuned model
fine_tuned_results = rouge.compute(
    predictions=predictions,
    references=references,
    rouge_types=['rouge1', 'rouge2', 'rougeL'],
    use_aggregator=True,
    use_stemmer=True,
)
print(fine_tuned_results)

# Baseline scores for the untuned base model, for comparison
base_results = rouge.compute(
    predictions=basic_predictions,
    references=references,
    rouge_types=['rouge1', 'rouge2', 'rougeL'],
    use_aggregator=True,
    use_stemmer=True,
)
print(base_results)
```
## Training Procedure

- Hardware: T4 GPU on Kaggle
- Framework: PyTorch with Hugging Face Transformers
- Training Time: ~50 minutes (Kaggle free tier)
## Recommendations

- Use beam search (`num_beams` between 3 and 5) for higher-quality summaries
- Combine with light post-processing (whitespace and punctuation cleanup) for formatting
- Fine-tune for more epochs when targeting complex dialogues
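The post-processing recommendation can be sketched as a small cleanup helper; the exact rules below (whitespace collapsing, capitalization, terminal punctuation) are illustrative assumptions, not part of the published model:

```python
import re

def clean_summary(text: str) -> str:
    """Minimal cleanup for a generated summary (illustrative rules)."""
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace
    if text and not text[0].isupper():
        text = text[0].upper() + text[1:]      # capitalize first letter
    if text and text[-1] not in ".!?":
        text += "."                            # ensure terminal punctuation
    return text

print(clean_summary("  the two speakers   agree to meet on friday"))
# → The two speakers agree to meet on friday.
```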
## Limitations

- Struggles with multi-topic dialogues
- May miss subtle contextual cues
- Performs best on short conversations (under ~500 tokens)
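Given the short-conversation sweet spot, it can help to screen inputs by length before summarizing. The check below uses whitespace splitting as a crude stand-in for the model's tokenizer count; both the function name and the heuristic are assumptions:

```python
def within_recommended_length(dialogue: str, max_tokens: int = 500) -> bool:
    """Rough pre-check against the ~500-token recommendation.

    Whitespace word count is only an approximation of the true subword
    token count; use the model's tokenizer for an exact figure.
    """
    return len(dialogue.split()) <= max_tokens

print(within_recommended_length("A: Hi there!\nB: Hello."))  # True
```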
## License

Apache 2.0 (same as the base FLAN-T5 model)
## Citation

```bibtex
@misc{dialogsum2021,
  title={DialogSum: A Real-Life Scenario Dialogue Summarization Dataset},
  author={Karthick Krishnamurthy},
  year={2021},
  howpublished={HuggingFace Datasets},
}
```