---
license: apache-2.0
base_model: google/flan-t5-large
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-large-mawpnli-calcx-nli-pt
    results: []
---

# flan-t5-large-mawpnli-calcx-nli-pt

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large); the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set (a sketch of the ROUGE computation follows the list):

- Loss: 0.1217
- Rouge1: 95.7098
- Rouge2: 89.9271
- Rougel: 95.5836
- Rougelsum: 95.5842
- Gen Len: 10.9151
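
ROUGE scores in `generated_from_trainer` cards are typically computed with the `evaluate` library and reported scaled by 100. A minimal sketch of that computation, with hypothetical predictions and references (the card does not document the actual evaluation pipeline):

```python
# Sketch of a typical ROUGE computation for cards like this one,
# using the evaluate library. The inputs below are hypothetical.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["3 + 4 = 7"]  # hypothetical model outputs
references = ["3 + 4 = 7"]   # hypothetical gold answers
scores = rouge.compute(
    predictions=predictions, references=references, use_stemmer=True
)
# compute() returns fractions in [0, 1]; the card reports them x 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```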

## Model description

More information needed

## Intended uses & limitations

More information needed
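
The card does not document usage, but as a standard seq2seq checkpoint the model should load with the usual `transformers` API. A minimal inference sketch, assuming the checkpoint is published under the repo id `vishwa27/flan-t5-large-mawpnli-calcx-nli-pt` (inferred from the model name, not confirmed by the card):

```python
# Minimal inference sketch; the repo id and prompt format below are
# assumptions, since neither is documented in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "vishwa27/flan-t5-large-mawpnli-calcx-nli-pt"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# Hypothetical math-word-problem style input; the exact input format
# used during fine-tuning is not documented.
inputs = tokenizer(
    "John has 3 apples and buys 4 more. How many apples does he have?",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```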

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
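
These settings map onto `Seq2SeqTrainingArguments` roughly as follows. The dataset, collator, and `compute_metrics` wiring are omitted because the card does not document them, and the evaluation flags are assumptions inferred from the per-epoch results table below:

```python
# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments
# (Transformers 4.35.x). Only the values listed above are known;
# everything marked "assumed" is an inference, not from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-mawpnli-calcx-nli-pt",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    evaluation_strategy="epoch",  # assumed: the table reports per-epoch eval
    predict_with_generate=True,   # assumed: needed for ROUGE and Gen Len
)
```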

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.2279        | 1.0   | 819  | 0.1290          | 95.075  | 87.8764 | 94.7902 | 94.8057   | 10.7978 |
| 0.0612        | 2.0   | 1638 | 0.1012          | 95.6219 | 89.6809 | 95.4399 | 95.4521   | 10.9029 |
| 0.0418        | 3.0   | 2457 | 0.0972          | 95.7709 | 90.1703 | 95.613  | 95.637    | 10.9328 |
| 0.0272        | 4.0   | 3276 | 0.1174          | 95.7478 | 90.1332 | 95.5931 | 95.6069   | 10.9395 |
| 0.0215        | 5.0   | 4095 | 0.1217          | 95.7098 | 89.9271 | 95.5836 | 95.5842   | 10.9151 |

### Framework versions

- Transformers 4.35.2
- Pytorch 1.12.1+cu113
- Datasets 2.15.0
- Tokenizers 0.15.0