---
license: other
base_model: microsoft/phi-1_5
tags:
  - generated_from_trainer
model-index:
  - name: phi-1_5-finetuned-SQL
    results: []
---

# phi-1_5-finetuned-SQL

This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5); the fine-tuning dataset is not recorded in this card. It achieves the following results on the evaluation set:

- Loss: 1.2770
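
Since the card ships no usage instructions, here is a minimal inference sketch. It assumes the checkpoint is hosted on the Hub as `EricWesthoff/phi-1_5-finetuned-SQL` (a hypothetical repo id inferred from the model name), that it loads like the base phi-1_5 causal LM, and a guessed prompt format, since the training data is not documented:

```python
# Minimal inference sketch. Assumptions: the repo id below is hypothetical,
# and the checkpoint behaves like the base microsoft/phi-1_5 causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "EricWesthoff/phi-1_5-finetuned-SQL"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # phi-1_5 used custom modeling code in Transformers 4.34
)

# The prompt format is a guess; the card does not document the training data.
prompt = "Question: List the names of all employees hired after 2020.\nSQL:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```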

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 18000
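
The listed values map directly onto 🤗 Transformers `TrainingArguments`; a sketch is below. The output directory is a placeholder, and anything not listed above (e.g. warmup) is assumed to be left at its Transformers 4.34 default:

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-1_5-finetuned-SQL",  # placeholder, not recorded in the card
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=18000,
    evaluation_strategy="steps",  # the card logs validation loss every 100 steps
    eval_steps=100,
    logging_steps=100,
)
```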

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.3757 | 0.04 | 100 | 2.0747 |
| 2.0269 | 0.08 | 200 | 1.9990 |
| 1.9535 | 0.12 | 300 | 1.9450 |
| 1.9136 | 0.16 | 400 | 1.9067 |
| 1.892 | 0.2 | 500 | 1.8757 |
| 1.8753 | 0.24 | 600 | 1.8574 |
| 1.8507 | 0.28 | 700 | 1.8359 |
| 1.8759 | 0.32 | 800 | 1.8167 |
| 1.8166 | 0.36 | 900 | 1.8054 |
| 1.8224 | 0.4 | 1000 | 1.7818 |
| 1.7852 | 0.44 | 1100 | 1.7814 |
| 1.8164 | 0.48 | 1200 | 1.7664 |
| 1.7632 | 0.52 | 1300 | 1.7598 |
| 1.8485 | 0.56 | 1400 | 1.7439 |
| 1.7712 | 0.6 | 1500 | 1.7303 |
| 1.7632 | 0.64 | 1600 | 1.7277 |
| 1.7378 | 0.68 | 1700 | 1.7135 |
| 1.7581 | 0.72 | 1800 | 1.7075 |
| 1.7261 | 0.76 | 1900 | 1.6933 |
| 1.7243 | 0.8 | 2000 | 1.6891 |
| 1.7311 | 0.84 | 2100 | 1.6837 |
| 1.7554 | 0.88 | 2200 | 1.6808 |
| 1.7026 | 0.92 | 2300 | 1.6646 |
| 1.7193 | 0.96 | 2400 | 1.6664 |
| 1.6861 | 1.0 | 2500 | 1.6577 |
| 1.68 | 1.04 | 2600 | 1.6470 |
| 1.5931 | 1.08 | 2700 | 1.6425 |
| 1.6655 | 1.12 | 2800 | 1.6352 |
| 1.629 | 1.16 | 2900 | 1.6298 |
| 1.6567 | 1.2 | 3000 | 1.6236 |
| 1.6225 | 1.24 | 3100 | 1.6242 |
| 1.6249 | 1.28 | 3200 | 1.6150 |
| 1.6263 | 1.32 | 3300 | 1.6077 |
| 1.6055 | 1.36 | 3400 | 1.6034 |
| 1.6338 | 1.4 | 3500 | 1.5996 |
| 1.6032 | 1.44 | 3600 | 1.5947 |
| 1.6447 | 1.48 | 3700 | 1.5882 |
| 1.6063 | 1.52 | 3800 | 1.5877 |
| 1.5933 | 1.56 | 3900 | 1.5850 |
| 1.6267 | 1.6 | 4000 | 1.5814 |
| 1.6151 | 1.64 | 4100 | 1.5709 |
| 1.6047 | 1.68 | 4200 | 1.5683 |
| 1.5811 | 1.72 | 4300 | 1.5661 |
| 1.5877 | 1.76 | 4400 | 1.5648 |
| 1.6321 | 1.8 | 4500 | 1.5645 |
| 1.5969 | 1.84 | 4600 | 1.5584 |
| 1.5971 | 1.88 | 4700 | 1.5565 |
| 1.622 | 1.92 | 4800 | 1.5547 |
| 1.6265 | 1.96 | 4900 | 1.5496 |
| 1.6145 | 2.0 | 5000 | 1.5466 |
| 1.526 | 2.04 | 5100 | 1.5427 |
| 1.5793 | 2.08 | 5200 | 1.5390 |
| 1.5714 | 2.12 | 5300 | 1.5375 |
| 1.5228 | 2.16 | 5400 | 1.5360 |
| 1.5383 | 2.2 | 5500 | 1.5343 |
| 1.5117 | 2.24 | 5600 | 1.5322 |
| 1.5427 | 2.28 | 5700 | 1.5316 |
| 1.4959 | 2.32 | 5800 | 1.5306 |
| 1.5456 | 2.36 | 5900 | 1.5299 |
| 1.5175 | 2.4 | 6000 | 1.5295 |
| 1.5823 | 2.44 | 6100 | 1.5498 |
| 1.5615 | 2.48 | 6200 | 1.5447 |
| 1.5326 | 2.52 | 6300 | 1.5463 |
| 1.567 | 2.56 | 6400 | 1.5450 |
| 1.5243 | 2.6 | 6500 | 1.5456 |
| 1.5214 | 2.64 | 6600 | 1.5383 |
| 1.6086 | 2.68 | 6700 | 1.5393 |
| 1.5391 | 2.72 | 6800 | 1.5285 |
| 1.5224 | 2.76 | 6900 | 1.5318 |
| 1.5567 | 2.8 | 7000 | 1.5292 |
| 1.5525 | 2.84 | 7100 | 1.5207 |
| 1.5399 | 2.88 | 7200 | 1.5135 |
| 1.5399 | 2.92 | 7300 | 1.5104 |
| 1.5765 | 2.96 | 7400 | 1.5085 |
| 1.556 | 3.0 | 7500 | 1.5042 |
| 1.4977 | 3.04 | 7600 | 1.4997 |
| 1.4818 | 3.08 | 7700 | 1.4930 |
| 1.4912 | 3.12 | 7800 | 1.4908 |
| 1.517 | 3.16 | 7900 | 1.4933 |
| 1.4971 | 3.2 | 8000 | 1.4857 |
| 1.4827 | 3.24 | 8100 | 1.4805 |
| 1.5096 | 3.28 | 8200 | 1.4804 |
| 1.4788 | 3.32 | 8300 | 1.4756 |
| 1.457 | 3.36 | 8400 | 1.4728 |
| 1.4819 | 3.4 | 8500 | 1.4717 |
| 1.5241 | 3.44 | 8600 | 1.4678 |
| 1.5081 | 3.48 | 8700 | 1.4676 |
| 1.5173 | 3.52 | 8800 | 1.4657 |
| 1.4765 | 3.56 | 8900 | 1.4643 |
| 1.4691 | 3.6 | 9000 | 1.4603 |
| 1.5034 | 3.64 | 9100 | 1.4577 |
| 1.4997 | 3.68 | 9200 | 1.4552 |
| 1.4849 | 3.72 | 9300 | 1.4504 |
| 1.5144 | 3.76 | 9400 | 1.4518 |
| 1.4972 | 3.8 | 9500 | 1.4469 |
| 1.4695 | 3.84 | 9600 | 1.4474 |
| 1.5088 | 3.88 | 9700 | 1.4468 |
| 1.4772 | 3.92 | 9800 | 1.4418 |
| 1.5207 | 3.96 | 9900 | 1.4390 |
| 1.5088 | 4.0 | 10000 | 1.4378 |
| 1.4915 | 4.04 | 10100 | 1.4324 |
| 1.4356 | 4.08 | 10200 | 1.4305 |
| 1.4388 | 4.12 | 10300 | 1.4268 |
| 1.4004 | 4.16 | 10400 | 1.4251 |
| 1.3909 | 4.2 | 10500 | 1.4225 |
| 1.4284 | 4.24 | 10600 | 1.4218 |
| 1.4422 | 4.28 | 10700 | 1.4213 |
| 1.4301 | 4.32 | 10800 | 1.4198 |
| 1.4309 | 4.36 | 10900 | 1.4174 |
| 1.415 | 4.4 | 11000 | 1.4147 |
| 1.4697 | 4.44 | 11100 | 1.4136 |
| 1.4241 | 4.48 | 11200 | 1.4123 |
| 1.4416 | 4.52 | 11300 | 1.4100 |
| 1.4229 | 4.56 | 11400 | 1.4094 |
| 1.4498 | 4.6 | 11500 | 1.4091 |
| 1.4023 | 4.64 | 11600 | 1.4083 |
| 1.4197 | 4.68 | 11700 | 1.4075 |
| 1.4165 | 4.72 | 11800 | 1.4070 |
| 1.4103 | 4.76 | 11900 | 1.4067 |
| 1.4214 | 4.8 | 12000 | 1.4066 |
| 1.4223 | 9.68 | 12100 | 1.4162 |
| 1.4471 | 9.76 | 12200 | 1.4210 |
| 1.4165 | 9.84 | 12300 | 1.4154 |
| 1.4088 | 9.92 | 12400 | 1.4105 |
| 1.4057 | 10.0 | 12500 | 1.4100 |
| 1.3778 | 10.08 | 12600 | 1.4034 |
| 1.4081 | 10.16 | 12700 | 1.4055 |
| 1.4127 | 10.24 | 12800 | 1.4001 |
| 1.4282 | 10.32 | 12900 | 1.3924 |
| 1.4069 | 10.4 | 13000 | 1.3909 |
| 1.4097 | 10.48 | 13100 | 1.3885 |
| 1.4173 | 10.56 | 13200 | 1.3824 |
| 1.4282 | 10.64 | 13300 | 1.3798 |
| 1.4266 | 10.72 | 13400 | 1.3778 |
| 1.4205 | 10.8 | 13500 | 1.3760 |
| 1.4347 | 10.88 | 13600 | 1.3730 |
| 1.4088 | 10.96 | 13700 | 1.3659 |
| 1.3859 | 11.04 | 13800 | 1.3605 |
| 1.3711 | 11.12 | 13900 | 1.3572 |
| 1.3896 | 11.2 | 14000 | 1.3550 |
| 1.343 | 11.28 | 14100 | 1.3510 |
| 1.3866 | 11.36 | 14200 | 1.3485 |
| 1.3603 | 11.44 | 14300 | 1.3468 |
| 1.3881 | 11.52 | 14400 | 1.3448 |
| 1.3841 | 11.6 | 14500 | 1.3422 |
| 1.358 | 11.68 | 14600 | 1.3379 |
| 1.3704 | 11.76 | 14700 | 1.3352 |
| 1.3656 | 11.84 | 14800 | 1.3350 |
| 1.367 | 11.92 | 14900 | 1.3299 |
| 1.3765 | 12.0 | 15000 | 1.3302 |
| 1.32 | 12.08 | 15100 | 1.3240 |
| 1.343 | 12.16 | 15200 | 1.3186 |
| 1.3254 | 12.24 | 15300 | 1.3159 |
| 1.3433 | 12.32 | 15400 | 1.3134 |
| 1.3347 | 12.4 | 15500 | 1.3113 |
| 1.3304 | 12.48 | 15600 | 1.3110 |
| 1.3235 | 12.56 | 15700 | 1.3106 |
| 1.3099 | 12.64 | 15800 | 1.3056 |
| 1.3176 | 12.72 | 15900 | 1.3027 |
| 1.3613 | 12.8 | 16000 | 1.3057 |
| 1.3238 | 12.88 | 16100 | 1.3006 |
| 1.354 | 12.96 | 16200 | 1.3003 |
| 1.3324 | 13.04 | 16300 | 1.2967 |
| 1.322 | 13.12 | 16400 | 1.2945 |
| 1.3029 | 13.2 | 16500 | 1.2898 |
| 1.317 | 13.28 | 16600 | 1.2892 |
| 1.2982 | 13.36 | 16700 | 1.2882 |
| 1.3092 | 13.44 | 16800 | 1.2878 |
| 1.3161 | 13.52 | 16900 | 1.2866 |
| 1.2895 | 13.6 | 17000 | 1.2844 |
| 1.28 | 13.68 | 17100 | 1.2834 |
| 1.2849 | 13.76 | 17200 | 1.2822 |
| 1.3136 | 13.84 | 17300 | 1.2828 |
| 1.2938 | 13.92 | 17400 | 1.2810 |
| 1.2994 | 14.0 | 17500 | 1.2803 |
| 1.3158 | 14.08 | 17600 | 1.2788 |
| 1.2783 | 14.16 | 17700 | 1.2779 |
| 1.2811 | 14.24 | 17800 | 1.2774 |
| 1.2824 | 14.32 | 17900 | 1.2771 |
| 1.2881 | 14.4 | 18000 | 1.2770 |
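
To put the final validation loss in perspective: assuming it is the usual mean token-level cross-entropy in nats (the `Trainer` default for causal LMs), it corresponds to a perplexity of roughly e^1.2770 ≈ 3.59, as the short check below shows.

```python
# Convert the final validation loss to perplexity, assuming the loss is
# mean cross-entropy per token in nats.
import math

final_val_loss = 1.2770
perplexity = math.exp(final_val_loss)
print(f"perplexity ~ {perplexity:.2f}")  # ~ 3.59
```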

### Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.6
- Tokenizers 0.14.1