bert-small2bert-small-finetuned-cnn_daily_mail-summarization-newsroom-filtered

This model is a fine-tuned version of mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization; the fine-tuning dataset is not specified. On the evaluation set it achieves the following results, which match the final logged checkpoint (step 2025) in the training results:

  • Loss: 3.5413
  • Rouge1: 32.3232
  • Rouge2: 20.9203
  • Rougel: 27.232
  • Rougelsum: 29.345
  • Gen Len: 72.2217
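
A minimal usage sketch, assuming the checkpoint loads as an encoder-decoder BERT model in the same way as its base model. The card does not give the Hub repository ID of this fine-tuned checkpoint, so the example defaults to the base checkpoint named above; `summarize`, `checkpoint`, and `BASE_CHECKPOINT` are illustrative names, not part of any library.

```python
# Base checkpoint named in this card. Replace it with the repository ID of the
# fine-tuned model once known (that ID is not stated in the card).
BASE_CHECKPOINT = "mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization"


def summarize(text, checkpoint=BASE_CHECKPOINT):
    """Return a generated summary of `text` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from transformers import BertTokenizerFast, EncoderDecoderModel

    tokenizer = BertTokenizerFast.from_pretrained(checkpoint)
    model = EncoderDecoderModel.from_pretrained(checkpoint)

    # BERT encoders accept at most 512 tokens, so longer articles are truncated.
    inputs = tokenizer(
        [text],
        padding="max_length",
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )
    output_ids = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(summarize("Long news article text goes here ..."))
```

Generation settings are left at the checkpoint's defaults here; the average generated length reported above (about 72 tokens) was measured on the evaluation set and may differ for other inputs.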

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 5
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
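
To make the learning-rate settings concrete, the sketch below computes a linear schedule with warmup in plain Python: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0. The peak rate (2e-05) and warmup (500 steps) come from the list above. The total of 2280 steps is an estimate, not a value from the card: the logged epoch/step pairs are consistent with 456 steps per epoch, multiplied by 5 epochs. `linear_schedule_lr` is an illustrative name.

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=500, total_steps=2280):
    """Learning rate at `step` for linear warmup followed by linear decay.

    total_steps=2280 is an assumption (456 steps per epoch x 5 epochs),
    inferred from the logged epoch/step pairs rather than stated in the card.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the rate at the evaluation steps logged during training.
for step in (405, 810, 1215, 1620, 2025):
    print(step, f"{linear_schedule_lr(step):.3e}")
```

Under these assumptions the first evaluation (step 405) falls inside the warmup phase, before the peak rate is reached.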

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len
3.796         | 0.89  | 405  | 3.6945          | 29.7168 | 17.6705 | 24.4204 | 26.484    | 69.6847
3.6426        | 1.78  | 810  | 3.5532          | 32.3051 | 20.8789 | 27.1724 | 29.384    | 72.3695
3.2645        | 2.66  | 1215 | 3.5437          | 32.2016 | 20.758  | 27.083  | 29.0954   | 73.3892
3.1719        | 3.55  | 1620 | 3.5377          | 32.5493 | 21.083  | 27.0881 | 29.4691   | 71.5222
2.9763        | 4.44  | 2025 | 3.5413          | 32.3232 | 20.9203 | 27.232  | 29.345    | 72.2217
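
The logged results can be compared programmatically. The sketch below copies a subset of the columns from the table above and picks out the checkpoint with the lowest validation loss, alongside the final checkpoint whose metrics are reported at the top of this card. The dictionary keys and variable names are illustrative.

```python
# Values copied from the training results table (subset of columns).
rows = [
    {"step": 405,  "epoch": 0.89, "val_loss": 3.6945, "rouge1": 29.7168, "rouge2": 17.6705},
    {"step": 810,  "epoch": 1.78, "val_loss": 3.5532, "rouge1": 32.3051, "rouge2": 20.8789},
    {"step": 1215, "epoch": 2.66, "val_loss": 3.5437, "rouge1": 32.2016, "rouge2": 20.758},
    {"step": 1620, "epoch": 3.55, "val_loss": 3.5377, "rouge1": 32.5493, "rouge2": 21.083},
    {"step": 2025, "epoch": 4.44, "val_loss": 3.5413, "rouge1": 32.3232, "rouge2": 20.9203},
]

# Checkpoint with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r["val_loss"])

# Last logged checkpoint; its metrics are the ones reported as the
# evaluation results of this card.
final = rows[-1]

print("lowest validation loss:", best_by_loss["step"], best_by_loss["val_loss"])
print("final checkpoint:", final["step"], final["val_loss"])
```

In these logs the lowest validation loss, and the highest Rouge1 and Rouge2, occur at step 1620 rather than at the final step 2025, although the differences are small.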

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1