Model Name: LoRA Fine-Tuned Model for Dialogue Summarization
Model Type: Seq2Seq with Low-Rank Adaptation (LoRA)
Base Model: google-t5/t5-base

Model Details

  • Architecture: T5-base
  • Fine-Tuning Technique: LoRA (Low-Rank Adaptation)
  • Approach: Parameter-Efficient Fine-Tuning (PEFT)
  • Data: SAMSum dataset
  • Metrics: Evaluated using ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum)
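
The ROUGE variants listed above score the overlap between a generated summary and a reference summary. The following is a minimal, dependency-free sketch of ROUGE-1 and ROUGE-L F1, included only to illustrate what the scores measure. It assumes lowercase whitespace tokenization with no stemming, so its numbers will not exactly match a full ROUGE implementation, and it is not the evaluation code used for this model.

```python
from collections import Counter


def _f1(matches: int, pred_len: int, ref_len: int) -> float:
    """F1 from a match count and the two sequence lengths."""
    if matches == 0:
        return 0.0
    precision = matches / pred_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram overlap (clipped counts) between prediction and reference."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    return _f1(overlap, len(pred), len(ref))


def _lcs_len(a: list, b: list) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rougeL_f1(prediction: str, reference: str) -> float:
    """F1 based on the longest common subsequence of tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_len(pred, ref), len(pred), len(ref))


# Hypothetical example pair, not taken from the model's evaluation data.
reference = "amanda baked cookies and will bring jerry some"
prediction = "jerry baked cookies for amanda"
r1 = rouge1_f1(prediction, reference)
rl = rougeL_f1(prediction, reference)
```

In this example ROUGE-1 counts four shared words regardless of order, while ROUGE-L only credits the in-order run "baked cookies", so the ROUGE-L score is lower.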

Intended Use

This model is designed for summarizing dialogues, such as conversations between individuals in a chat or messaging context. It’s suitable for applications in:

  • Customer Service: Summarizing chat logs for quality monitoring or training.
  • Messaging Apps: Generating conversation summaries for user convenience.
  • Content Creation: Assisting writers by summarizing character dialogues.
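
For any of the uses above, inference might look like the sketch below. It rests on several assumptions that this card does not confirm: that the checkpoint `dnzblgn/Chat-Summarization` ships a tokenizer and loads with `AutoModelForSeq2SeqLM`, that inputs use the conventional T5 `summarize: ` prefix, and that beam search with 4 beams is a reasonable decoding choice. The helper names `format_dialogue` and `summarize` are hypothetical.

```python
def format_dialogue(turns, prefix="summarize: "):
    """Join (speaker, utterance) pairs into a single model input string.

    The "summarize: " prefix is the usual T5 convention and is an
    assumption here; the card does not state the prompt format.
    """
    lines = [f"{speaker}: {text.strip()}" for speaker, text in turns]
    return prefix + "\n".join(lines)


def summarize(turns, model_id="dnzblgn/Chat-Summarization", max_new_tokens=64):
    """Generate a summary for a list of (speaker, utterance) pairs."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(
        format_dialogue(turns),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Hypothetical chat used only to show the input format.
chat = [
    ("Amanda", "I baked cookies. Do you want some?"),
    ("Jerry", "Sure!"),
]
prompt = format_dialogue(chat)
```

Calling `summarize(chat)` downloads the checkpoint on first use and returns the decoded summary string.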

Training Process

Optimizer: AdamW with learning rate 3e-5

Batch Size: 4, with 2 gradient accumulation steps (effective batch size of 8)

Training Epochs: 2

Evaluation Metrics: ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum

Hardware: Trained on a single GPU with mixed precision to reduce memory use and speed up training.
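
The hyperparameters above could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: the output directory, the choice of `fp16` for mixed precision, and the `predict_with_generate` flag are assumptions, and the helper names are hypothetical.

```python
# Values taken from the list above.
HPARAMS = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 2,
}


def effective_batch_size(hparams, num_gpus=1):
    """Examples contributing to each optimizer step."""
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_gpus
    )


def build_training_args(output_dir="./lora-t5-samsum"):
    """Build training arguments; requires transformers and a GPU for fp16."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,        # hypothetical path
        optim="adamw_torch",          # AdamW, as stated in the card
        fp16=True,                    # assumption: the card only says "mixed precision"
        predict_with_generate=True,   # assumption: needed to compute ROUGE on generated text
        **HPARAMS,
    )


batch = effective_batch_size(HPARAMS)
```

With one GPU, a per-device batch of 4 and 2 accumulation steps give 8 examples per optimizer step.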

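The model was trained with the Seq2SeqTrainer class from the transformers library. LoRA adapters were applied to selected attention layers, so only a small set of low-rank matrices was trained while the base T5 weights stayed frozen, reducing memory and compute cost while aiming to preserve summarization quality.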
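
To make the savings concrete, the sketch below counts the parameters LoRA would add to T5-base and shows one way to wrap the base model with the `peft` library. The card does not state the rank, scaling, dropout, or which attention projections were targeted, so `RANK = 8`, `lora_alpha=32`, `lora_dropout=0.05`, and the `q`/`v` targets are illustrative assumptions, and the helper names are hypothetical.

```python
def lora_param_count(d_in, d_out, rank, num_matrices):
    """LoRA adds two matrices per target: (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out) * num_matrices


# T5-base: hidden size 768, 12 encoder and 12 decoder layers.
# Each decoder layer has self-attention and cross-attention, so there are
# 12 + 12 * 2 = 36 attention blocks in total.
ATTENTION_BLOCKS = 12 + 12 * 2
TARGETS = ["q", "v"]   # assumption: query and value projections
RANK = 8               # assumption: not stated in the card

trainable = lora_param_count(768, 768, RANK, ATTENTION_BLOCKS * len(TARGETS))


def build_lora_model(base_id="google-t5/t5-base"):
    """Wrap the base model with LoRA adapters; requires peft and transformers."""
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForSeq2SeqLM

    config = LoraConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        r=RANK,
        lora_alpha=32,       # assumption
        lora_dropout=0.05,   # assumption
        target_modules=TARGETS,
    )
    base = AutoModelForSeq2SeqLM.from_pretrained(base_id)
    return get_peft_model(base, config)
```

Under these assumptions only 884,736 parameters are trained, well under 1% of the full model; a different rank or target set changes this number proportionally.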

Model Size: 223M parameters (F32, Safetensors)