license: mit
base_model: vicgalle/gpt2-alpaca-gpt4
tags:
  - generated_from_trainer
model-index:
  - name: gpt2-alpaca-pandalm
    results: []

gpt2-alpaca-pandalm

This model is a fine-tuned version of vicgalle/gpt2-alpaca-gpt4. The fine-tuning dataset was not recorded in this card. It achieves the following result on the evaluation set:

  • Loss: 0.8219
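
For readers who prefer perplexity, the loss above can be converted with a one-line calculation. This is a minimal sketch that assumes the reported value is the mean per-token cross-entropy in nats, which is typical for causal language model training but is not stated in this card. The function name `perplexity` is illustrative only.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 0.8219


def perplexity(mean_nll: float) -> float:
    """Convert a mean negative log-likelihood (in nats) to perplexity.

    Assumption: the loss is a natural-log, per-token cross-entropy.
    """
    return math.exp(mean_nll)


print(f"perplexity: {perplexity(EVAL_LOSS):.2f}")
```

If the loss was instead averaged per sequence or computed with a different log base, this conversion would not apply.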

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 1
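
To make the learning-rate settings above concrete, the following sketch computes the learning rate at a given step. It assumes the common shape for a cosine scheduler with warmup: a linear ramp from zero to the peak rate, then a half-cosine decay to zero. It also assumes the run had about 71,600 optimizer steps, taken from the last step logged in the training results; the true total is not stated in this card. The names `lr_at`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative.

```python
import math

PEAK_LR = 2e-4        # learning_rate from the list above
WARMUP_RATIO = 0.03   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 71_600  # assumption: last step logged in the training results

# Assumption: warmup length is the ratio times the total step count, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    # Half-cosine decay from the peak learning rate down to 0.
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {lr_at(step):.2e}")
```

Under these assumptions the warmup covers roughly the first 2,100 steps, so nearly all of the run takes place in the decay phase.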

Training results

Training Loss Epoch Step Validation Loss
No log 0.0 200 2.2796
No log 0.01 400 1.7930
No log 0.01 600 1.2870
No log 0.01 800 1.1460
No log 0.01 1000 1.0742
No log 0.02 1200 1.0431
No log 0.02 1400 1.0250
No log 0.02 1600 1.0108
No log 0.03 1800 0.9998
No log 0.03 2000 0.9937
No log 0.03 2200 0.9791
No log 0.03 2400 0.9817
No log 0.04 2600 0.9617
No log 0.04 2800 0.9580
1.2199 0.04 3000 0.9757
1.2199 0.04 3200 0.9541
1.2199 0.05 3400 0.9548
1.2199 0.05 3600 0.9485
1.2199 0.05 3800 0.9395
1.2199 0.06 4000 0.9413
1.2199 0.06 4200 0.9336
1.2199 0.06 4400 0.9369
1.2199 0.06 4600 0.9346
1.2199 0.07 4800 0.9277
1.2199 0.07 5000 0.9255
1.2199 0.07 5200 0.9253
1.2199 0.08 5400 0.9152
1.2199 0.08 5600 0.9203
1.2199 0.08 5800 0.9244
0.9222 0.08 6000 0.9178
0.9222 0.09 6200 0.9230
0.9222 0.09 6400 0.9109
0.9222 0.09 6600 0.9132
0.9222 0.09 6800 0.9159
0.9222 0.1 7000 0.9090
0.9222 0.1 7200 0.9073
0.9222 0.1 7400 0.9115
0.9222 0.11 7600 0.9125
0.9222 0.11 7800 0.9087
0.9222 0.11 8000 0.9103
0.9222 0.11 8200 0.9061
0.9222 0.12 8400 0.9047
0.9222 0.12 8600 0.9025
0.9222 0.12 8800 0.9023
0.8883 0.13 9000 0.8949
0.8883 0.13 9200 0.8939
0.8883 0.13 9400 0.8942
0.8883 0.13 9600 0.8993
0.8883 0.14 9800 0.8925
0.8883 0.14 10000 0.8891
0.8883 0.14 10200 0.8874
0.8883 0.15 10400 0.8941
0.8883 0.15 10600 0.8905
0.8883 0.15 10800 0.8863
0.8883 0.15 11000 0.8916
0.8883 0.16 11200 0.8902
0.8883 0.16 11400 0.8851
0.8883 0.16 11600 0.8832
0.8883 0.16 11800 0.8824
0.8719 0.17 12000 0.8793
0.8719 0.17 12200 0.8797
0.8719 0.17 12400 0.8810
0.8719 0.18 12600 0.8796
0.8719 0.18 12800 0.8749
0.8719 0.18 13000 0.8740
0.8719 0.18 13200 0.8757
0.8719 0.19 13400 0.8767
0.8719 0.19 13600 0.8778
0.8719 0.19 13800 0.8793
0.8719 0.2 14000 0.8776
0.8719 0.2 14200 0.8740
0.8719 0.2 14400 0.8731
0.8719 0.2 14600 0.8729
0.8719 0.21 14800 0.8733
0.8605 0.21 15000 0.8739
0.8605 0.21 15200 0.8669
0.8605 0.21 15400 0.8629
0.8605 0.22 15600 0.8673
0.8605 0.22 15800 0.8653
0.8605 0.22 16000 0.8703
0.8605 0.23 16200 0.8685
0.8605 0.23 16400 0.8693
0.8605 0.23 16600 0.8684
0.8605 0.23 16800 0.8629
0.8605 0.24 17000 0.8643
0.8605 0.24 17200 0.8625
0.8605 0.24 17400 0.8604
0.8605 0.25 17600 0.8599
0.8605 0.25 17800 0.8617
0.8537 0.25 18000 0.8618
0.8537 0.25 18200 0.8608
0.8537 0.26 18400 0.8626
0.8537 0.26 18600 0.8607
0.8537 0.26 18800 0.8577
0.8537 0.26 19000 0.8584
0.8537 0.27 19200 0.8597
0.8537 0.27 19400 0.8561
0.8537 0.27 19600 0.8578
0.8537 0.28 19800 0.8545
0.8537 0.28 20000 0.8539
0.8537 0.28 20200 0.8578
0.8537 0.28 20400 0.8536
0.8537 0.29 20600 0.8527
0.8537 0.29 20800 0.8551
0.8472 0.29 21000 0.8542
0.8472 0.3 21200 0.8547
0.8472 0.3 21400 0.8528
0.8472 0.3 21600 0.8540
0.8472 0.3 21800 0.8503
0.8472 0.31 22000 0.8498
0.8472 0.31 22200 0.8502
0.8472 0.31 22400 0.8522
0.8472 0.32 22600 0.8499
0.8472 0.32 22800 0.8511
0.8472 0.32 23000 0.8503
0.8472 0.32 23200 0.8498
0.8472 0.33 23400 0.8463
0.8472 0.33 23600 0.8488
0.8472 0.33 23800 0.8510
0.8421 0.33 24000 0.8479
0.8421 0.34 24200 0.8486
0.8421 0.34 24400 0.8485
0.8421 0.34 24600 0.8484
0.8421 0.35 24800 0.8495
0.8421 0.35 25000 0.8475
0.8421 0.35 25200 0.8484
0.8421 0.35 25400 0.8479
0.8421 0.36 25600 0.8479
0.8421 0.36 25800 0.8452
0.8421 0.36 26000 0.8481
0.8421 0.37 26200 0.8479
0.8421 0.37 26400 0.8442
0.8421 0.37 26600 0.8441
0.8421 0.37 26800 0.8440
0.8377 0.38 27000 0.8412
0.8377 0.38 27200 0.8421
0.8377 0.38 27400 0.8432
0.8377 0.38 27600 0.8425
0.8377 0.39 27800 0.8432
0.8377 0.39 28000 0.8432
0.8377 0.39 28200 0.8421
0.8377 0.4 28400 0.8435
0.8377 0.4 28600 0.8432
0.8377 0.4 28800 0.8417
0.8377 0.4 29000 0.8403
0.8377 0.41 29200 0.8444
0.8377 0.41 29400 0.8425
0.8377 0.41 29600 0.8422
0.8377 0.42 29800 0.8438
0.8291 0.42 30000 0.8399
0.8291 0.42 30200 0.8450
0.8291 0.42 30400 0.8421
0.8291 0.43 30600 0.8402
0.8291 0.43 30800 0.8441
0.8291 0.43 31000 0.8418
0.8291 0.44 31200 0.8422
0.8291 0.44 31400 0.8376
0.8291 0.44 31600 0.8386
0.8291 0.44 31800 0.8412
0.8291 0.45 32000 0.8447
0.8291 0.45 32200 0.8428
0.8291 0.45 32400 0.8409
0.8291 0.45 32600 0.8375
0.8291 0.46 32800 0.8354
0.8279 0.46 33000 0.8360
0.8279 0.46 33200 0.8373
0.8279 0.47 33400 0.8372
0.8279 0.47 33600 0.8393
0.8279 0.47 33800 0.8363
0.8279 0.47 34000 0.8370
0.8279 0.48 34200 0.8359
0.8279 0.48 34400 0.8336
0.8279 0.48 34600 0.8334
0.8279 0.49 34800 0.8322
0.8279 0.49 35000 0.8326
0.8279 0.49 35200 0.8315
0.8279 0.49 35400 0.8354
0.8279 0.5 35600 0.8360
0.8279 0.5 35800 0.8321
0.8254 0.5 36000 0.8341
0.8254 0.5 36200 0.8350
0.8254 0.51 36400 0.8344
0.8254 0.51 36600 0.8335
0.8254 0.51 36800 0.8337
0.8254 0.52 37000 0.8305
0.8254 0.52 37200 0.8308
0.8254 0.52 37400 0.8319
0.8254 0.52 37600 0.8320
0.8254 0.53 37800 0.8292
0.8254 0.53 38000 0.8316
0.8254 0.53 38200 0.8329
0.8254 0.54 38400 0.8314
0.8254 0.54 38600 0.8301
0.8254 0.54 38800 0.8319
0.822 0.54 39000 0.8325
0.822 0.55 39200 0.8313
0.822 0.55 39400 0.8305
0.822 0.55 39600 0.8303
0.822 0.55 39800 0.8283
0.822 0.56 40000 0.8315
0.822 0.56 40200 0.8280
0.822 0.56 40400 0.8316
0.822 0.57 40600 0.8303
0.822 0.57 40800 0.8317
0.822 0.57 41000 0.8302
0.822 0.57 41200 0.8298
0.822 0.58 41400 0.8313
0.822 0.58 41600 0.8304
0.822 0.58 41800 0.8289
0.819 0.59 42000 0.8293
0.819 0.59 42200 0.8315
0.819 0.59 42400 0.8250
0.819 0.59 42600 0.8264
0.819 0.6 42800 0.8282
0.819 0.6 43000 0.8290
0.819 0.6 43200 0.8283
0.819 0.61 43400 0.8291
0.819 0.61 43600 0.8271
0.819 0.61 43800 0.8261
0.819 0.61 44000 0.8276
0.819 0.62 44200 0.8274
0.819 0.62 44400 0.8279
0.819 0.62 44600 0.8264
0.819 0.62 44800 0.8275
0.8203 0.63 45000 0.8266
0.8203 0.63 45200 0.8259
0.8203 0.63 45400 0.8277
0.8203 0.64 45600 0.8269
0.8203 0.64 45800 0.8261
0.8203 0.64 46000 0.8245
0.8203 0.64 46200 0.8243
0.8203 0.65 46400 0.8242
0.8203 0.65 46600 0.8244
0.8203 0.65 46800 0.8237
0.8203 0.66 47000 0.8247
0.8203 0.66 47200 0.8238
0.8203 0.66 47400 0.8239
0.8203 0.66 47600 0.8252
0.8203 0.67 47800 0.8267
0.8169 0.67 48000 0.8245
0.8169 0.67 48200 0.8251
0.8169 0.67 48400 0.8247
0.8169 0.68 48600 0.8252
0.8169 0.68 48800 0.8259
0.8169 0.68 49000 0.8244
0.8169 0.69 49200 0.8245
0.8169 0.69 49400 0.8260
0.8169 0.69 49600 0.8265
0.8169 0.69 49800 0.8258
0.8169 0.7 50000 0.8274
0.8169 0.7 50200 0.8287
0.8169 0.7 50400 0.8280
0.8169 0.71 50600 0.8266
0.8169 0.71 50800 0.8259
0.8153 0.71 51000 0.8263
0.8153 0.71 51200 0.8260
0.8153 0.72 51400 0.8258
0.8153 0.72 51600 0.8251
0.8153 0.72 51800 0.8250
0.8153 0.73 52000 0.8254
0.8153 0.73 52200 0.8244
0.8153 0.73 52400 0.8236
0.8153 0.73 52600 0.8234
0.8153 0.74 52800 0.8251
0.8153 0.74 53000 0.8246
0.8153 0.74 53200 0.8248
0.8153 0.74 53400 0.8236
0.8153 0.75 53600 0.8243
0.8153 0.75 53800 0.8255
0.8123 0.75 54000 0.8246
0.8123 0.76 54200 0.8235
0.8123 0.76 54400 0.8235
0.8123 0.76 54600 0.8235
0.8123 0.76 54800 0.8238
0.8123 0.77 55000 0.8242
0.8123 0.77 55200 0.8233
0.8123 0.77 55400 0.8236
0.8123 0.78 55600 0.8226
0.8123 0.78 55800 0.8225
0.8123 0.78 56000 0.8220
0.8123 0.78 56200 0.8228
0.8123 0.79 56400 0.8230
0.8123 0.79 56600 0.8226
0.8123 0.79 56800 0.8223
0.8106 0.79 57000 0.8229
0.8106 0.8 57200 0.8225
0.8106 0.8 57400 0.8229
0.8106 0.8 57600 0.8230
0.8106 0.81 57800 0.8234
0.8106 0.81 58000 0.8230
0.8106 0.81 58200 0.8231
0.8106 0.81 58400 0.8227
0.8106 0.82 58600 0.8227
0.8106 0.82 58800 0.8213
0.8106 0.82 59000 0.8209
0.8106 0.83 59200 0.8213
0.8106 0.83 59400 0.8214
0.8106 0.83 59600 0.8219
0.8106 0.83 59800 0.8220
0.813 0.84 60000 0.8214
0.813 0.84 60200 0.8217
0.813 0.84 60400 0.8217
0.813 0.85 60600 0.8221
0.813 0.85 60800 0.8227
0.813 0.85 61000 0.8225
0.813 0.85 61200 0.8226
0.813 0.86 61400 0.8218
0.813 0.86 61600 0.8223
0.813 0.86 61800 0.8229
0.813 0.86 62000 0.8224
0.813 0.87 62200 0.8222
0.813 0.87 62400 0.8222
0.813 0.87 62600 0.8224
0.813 0.88 62800 0.8224
0.8073 0.88 63000 0.8226
0.8073 0.88 63200 0.8222
0.8073 0.88 63400 0.8219
0.8073 0.89 63600 0.8216
0.8073 0.89 63800 0.8214
0.8073 0.89 64000 0.8212
0.8073 0.9 64200 0.8214
0.8073 0.9 64400 0.8216
0.8073 0.9 64600 0.8217
0.8073 0.9 64800 0.8219
0.8073 0.91 65000 0.8217
0.8073 0.91 65200 0.8217
0.8073 0.91 65400 0.8217
0.8073 0.91 65600 0.8219
0.8073 0.92 65800 0.8219
0.8095 0.92 66000 0.8217
0.8095 0.92 66200 0.8218
0.8095 0.93 66400 0.8218
0.8095 0.93 66600 0.8217
0.8095 0.93 66800 0.8217
0.8095 0.93 67000 0.8216
0.8095 0.94 67200 0.8217
0.8095 0.94 67400 0.8218
0.8095 0.94 67600 0.8218
0.8095 0.95 67800 0.8218
0.8095 0.95 68000 0.8217
0.8095 0.95 68200 0.8218
0.8095 0.95 68400 0.8218
0.8095 0.96 68600 0.8219
0.8095 0.96 68800 0.8219
0.8086 0.96 69000 0.8218
0.8086 0.96 69200 0.8218
0.8086 0.97 69400 0.8219
0.8086 0.97 69600 0.8218
0.8086 0.97 69800 0.8219
0.8086 0.98 70000 0.8219
0.8086 0.98 70200 0.8219
0.8086 0.98 70400 0.8219
0.8086 0.98 70600 0.8219
0.8086 0.99 70800 0.8219
0.8086 0.99 71000 0.8219
0.8086 0.99 71200 0.8219
0.8086 1.0 71400 0.8219
0.8086 1.0 71600 0.8219
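
The rows above are whitespace-separated, and the first column is either a number or the two-word placeholder "No log", which appears on rows evaluated before the first training loss was logged. A minimal parsing sketch follows for readers who want to plot or filter the log. The names `LogRow` and `parse_row` are hypothetical helpers, not part of any library, and the sample contains only three rows copied from the table.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    training_loss: Optional[float]  # None where the table shows "No log"
    epoch: float
    step: int
    validation_loss: float


def parse_row(line: str) -> LogRow:
    """Parse one whitespace-separated row of the training results table."""
    # The last three fields are always epoch, step, and validation loss;
    # whatever precedes them is the training loss or the text "No log".
    *loss_parts, epoch, step, validation_loss = line.split()
    loss_text = " ".join(loss_parts)
    training_loss = None if loss_text == "No log" else float(loss_text)
    return LogRow(training_loss, float(epoch), int(step), float(validation_loss))


# Three rows copied from the table above.
SAMPLE = """No log 0.0 200 2.2796
1.2199 0.04 3000 0.9757
0.8086 1.0 71600 0.8219"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]
best = min(rows, key=lambda row: row.validation_loss)
print(f"lowest validation loss in sample: {best.validation_loss} at step {best.step}")
```

Running the same parser over the full table would give the step with the lowest validation loss across the whole run.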

Framework versions

  • Transformers 4.31.0
  • PyTorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.13.3