---
license: mit
base_model: microsoft/phi-2
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: phi-2-coedit
    results: []
---

phi-2-coedit

This model is a fine-tuned version of microsoft/phi-2; the training dataset is not documented in this card, though the model name suggests the CoEdIT text-editing dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):

  • Loss: 0.7388
  • Rouge1: 0.5206
  • Rouge2: 0.4123
  • Rougel: 0.4979
  • Rougelsum: 0.5032
  • Sacrebleu: 28.1346
  • Memory Used: 81917.5
  • Cuda Allocated: 10795.7861
  • Cuda Reserved: 74746.0
  • Ram Usage: 24042.6719
  • Em: 0.0
  • Gen Len: 120.6545
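
The ROUGE and SacreBLEU scores above can be reproduced with the Hugging Face `evaluate` library. A minimal sketch follows; the predictions and references here are placeholders, not the actual evaluation set:

```python
# Hedged sketch of computing the ROUGE and SacreBLEU metrics reported above.
# The example strings are placeholders, not the real eval data.
import evaluate

rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")

predictions = ["The quick brown fox jumps over the lazy dog."]
references = ["The quick brown fox jumped over the lazy dog."]

# rouge returns rouge1 / rouge2 / rougeL / rougeLsum, matching the keys above.
print(rouge.compute(predictions=predictions, references=references))
# sacrebleu expects one list of reference strings per prediction.
print(sacrebleu.compute(predictions=predictions,
                        references=[[r] for r in references]))
```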

Model description

More information needed

Intended uses & limitations

More information needed
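
In the absence of detailed documentation, here is a minimal inference sketch. Both the repo id `iliazlobin/phi-2-coedit` and the CoEdIT-style instruction prompt are assumptions, not confirmed by this card:

```python
# Hedged inference sketch; repo id and prompt format are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iliazlobin/phi-2-coedit"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Fix grammatical errors in this sentence: She no went to the market."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since phi-2 is a causal language model, the decoded output repeats and continues the prompt; downstream use should strip the prompt from the generated text.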

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 35
  • eval_batch_size: 35
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 140
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
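
A hedged sketch of a `TrainingArguments` setup reproducing these hyperparameters; `output_dir` and the use of `fp16` for "Native AMP" are assumptions:

```python
# Sketch of TrainingArguments mirroring the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-2-coedit",       # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=35,
    per_device_eval_batch_size=35,
    seed=42,
    gradient_accumulation_steps=4,   # 35 * 4 = 140 effective batch size
    lr_scheduler_type="linear",
    warmup_steps=1,
    num_train_epochs=2,
    fp16=True,                       # "Native AMP" mixed precision
)
```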

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Sacrebleu | Memory Used | Cuda Allocated | Cuda Reserved | Ram Usage | Em  | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:---------:|:-----------:|:--------------:|:-------------:|:---------:|:---:|:--------:|
| 0.5716        | 0.22  | 100  | 0.7558          | 0.5041 | 0.3927 | 0.4809 | 0.4853    | 26.9798   | 81917.5     | 10795.811      | 74738.0       | 22888.4102| 0.0 | 120.3347 |
| 0.5407        | 0.44  | 200  | 0.7404          | 0.5241 | 0.4171 | 0.5013 | 0.5068    | 27.6806   | 81917.5     | 10795.814      | 74738.0       | 23733.9805| 0.0 | 120.8277 |
| 0.5324        | 0.66  | 300  | 0.7230          | 0.5176 | 0.4093 | 0.4947 | 0.5002    | 27.5145   | 81917.5     | 10795.8184     | 74738.0       | 23831.1484| 0.0 | 120.576  |
| 0.5107        | 0.88  | 400  | 0.7161          | 0.5256 | 0.4167 | 0.5042 | 0.5092    | 28.1274   | 81917.5     | 10795.7935     | 74738.0       | 23891.7891| 0.0 | 120.5225 |
| 0.4374        | 1.1   | 500  | 0.7495          | 0.5237 | 0.414  | 0.501  | 0.5059    | 28.0405   | 81917.5     | 10795.7861     | 74746.0       | 23922.043 | 0.0 | 120.3181 |
| 0.3515        | 1.32  | 600  | 0.7418          | 0.5216 | 0.4133 | 0.499  | 0.5049    | 28.0528   | 81917.5     | 10795.7832     | 74746.0       | 23973.8164| 0.0 | 120.6453 |
| 0.3449        | 1.54  | 700  | 0.7386          | 0.5242 | 0.4163 | 0.5016 | 0.5075    | 28.3145   | 81917.5     | 10795.8066     | 74746.0       | 23950.1016| 0.0 | 120.5367 |
| 0.3375        | 1.76  | 800  | 0.7354          | 0.5194 | 0.4124 | 0.4973 | 0.5025    | 28.0252   | 81917.5     | 10795.814      | 74746.0       | 23931.0   | 0.0 | 120.6476 |
| 0.3373        | 1.98  | 900  | 0.7388          | 0.5206 | 0.4123 | 0.4979 | 0.5032    | 28.1346   | 81917.5     | 10795.7861     | 74746.0       | 24042.6719| 0.0 | 120.6545 |
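
The Step and Epoch columns imply roughly 100 / 0.22 ≈ 455 optimizer steps per epoch; with the effective batch size of 140 that suggests a training set of about 63–64k examples. A quick back-of-the-envelope check:

```python
# Approximate steps per epoch and implied training-set size, derived from
# the table above (the epoch fractions in the table are rounded).
steps, epoch_fraction, effective_batch = 100, 0.22, 140
steps_per_epoch = steps / epoch_fraction                 # ≈ 454.5
implied_train_size = steps_per_epoch * effective_batch   # ≈ 63,636
print(round(steps_per_epoch), round(implied_train_size))
```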

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2
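
For reproducibility, these versions can be pinned directly (assuming the standard PyPI package names):

```
transformers==4.39.3
torch==2.2.2
datasets==2.18.0
tokenizers==0.15.2
```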