BERT_WordPiece_wikitext

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6158
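
The card does not say how the loss is computed. If it is a mean per-token cross-entropy in nats (an assumption), it can be converted to a perplexity, which is often easier to interpret. A minimal sketch:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.6158

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is exp(loss). The card itself does not confirm this.
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the reported loss corresponds to a perplexity of about 5.03.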

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • training_steps: 1000000
  • mixed_precision_training: Native AMP
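
To make the learning-rate settings above concrete, the sketch below computes the learning rate at a given step. It assumes the "linear" scheduler ramps linearly from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the final training step. The function name `lr_at` is hypothetical and used only for illustration.

```python
# Values taken from the hyperparameter list above.
PEAK_LR = 1e-4
WARMUP_STEPS = 10_000
TRAINING_STEPS = 1_000_000


def lr_at(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    Assumption: warmup rises from 0 to PEAK_LR over WARMUP_STEPS, then the
    rate falls linearly to 0 at TRAINING_STEPS.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


for step in (0, 5_000, 10_000, 505_000, 1_000_000):
    print(step, lr_at(step))
```

With these settings the peak rate is reached at step 10000, which falls between the evaluations at epoch 1.2077 and 1.5097 in the results table.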

Training results

Training Loss Epoch Step Validation Loss
6.863 0.3019 2000 6.7607
6.5419 0.6039 4000 6.4556
6.1851 0.9058 6000 6.0700
4.7743 1.2077 8000 4.5895
4.1838 1.5097 10000 3.9989
3.7908 1.8116 12000 3.6092
3.5306 2.1135 14000 3.3678
3.3543 2.4155 16000 3.2261
3.2495 2.7174 18000 3.1645
3.1317 3.0193 20000 3.0059
3.0525 3.3213 22000 2.9423
2.9949 3.6232 24000 2.8740
2.9194 3.9251 26000 2.8131
2.8713 4.2271 28000 2.7365
2.8181 4.5290 30000 2.7355
2.7905 4.8309 32000 2.7272
2.7416 5.1329 34000 2.6691
2.7184 5.4348 36000 2.6006
2.688 5.7367 38000 2.5965
2.6406 6.0386 40000 2.5306
2.6169 6.3406 42000 2.5166
2.5934 6.6425 44000 2.4899
2.5698 6.9444 46000 2.4968
2.5438 7.2464 48000 2.4720
2.5208 7.5483 50000 2.4320
2.5101 7.8502 52000 2.4366
2.4934 8.1522 54000 2.4073
2.4691 8.4541 56000 2.4067
2.4563 8.7560 58000 2.3467
2.4371 9.0580 60000 2.3657
2.4118 9.3599 62000 2.3741
2.4145 9.6618 64000 2.2947
2.4134 9.9638 66000 2.2967
2.3684 10.2657 68000 2.3218
2.3681 10.5676 70000 2.3107
2.3504 10.8696 72000 2.2708
2.3351 11.1715 74000 2.2793
2.3273 11.4734 76000 2.2305
2.319 11.7754 78000 2.2447
2.2995 12.0773 80000 2.2363
2.288 12.3792 82000 2.2267
2.2976 12.6812 84000 2.2389
2.2933 12.9831 86000 2.2228
2.2617 13.2850 88000 2.2064
2.2627 13.5870 90000 2.2009
2.2624 13.8889 92000 2.1846
2.2421 14.1908 94000 2.1673
2.2192 14.4928 96000 2.1963
2.2307 14.7947 98000 2.1438
2.2098 15.0966 100000 2.1745
2.2048 15.3986 102000 2.1436
2.2026 15.7005 104000 2.1100
2.193 16.0024 106000 2.1384
2.1941 16.3043 108000 2.1376
2.1666 16.6063 110000 2.1412
2.1751 16.9082 112000 2.1320
2.16 17.2101 114000 2.1062
2.1604 17.5121 116000 2.1162
2.152 17.8140 118000 2.1199
2.1205 18.1159 120000 2.0868
2.1303 18.4179 122000 2.0792
2.1401 18.7198 124000 2.1032
2.1089 19.0217 126000 2.0720
2.1199 19.3237 128000 2.0739
2.1052 19.6256 130000 2.0821
2.1124 19.9275 132000 2.0692
2.0942 20.2295 134000 2.0647
2.0837 20.5314 136000 2.0412
2.0851 20.8333 138000 2.0587
2.0716 21.1353 140000 2.0387
2.0726 21.4372 142000 2.0561
2.0699 21.7391 144000 2.0362
2.0649 22.0411 146000 2.0262
2.0451 22.3430 148000 2.0384
2.0585 22.6449 150000 2.0216
2.0506 22.9469 152000 2.0146
2.0379 23.2488 154000 2.0229
2.0318 23.5507 156000 2.0047
2.0535 23.8527 158000 1.9898
2.0144 24.1546 160000 2.0137
2.0144 24.4565 162000 1.9826
2.0232 24.7585 164000 1.9941
2.0115 25.0604 166000 1.9809
2.0113 25.3623 168000 1.9985
2.0067 25.6643 170000 2.0048
1.9999 25.9662 172000 1.9766
1.9904 26.2681 174000 1.9681
1.9883 26.5700 176000 1.9903
1.995 26.8720 178000 1.9802
1.9856 27.1739 180000 1.9316
1.9686 27.4758 182000 2.0114
1.9761 27.7778 184000 1.9609
1.9586 28.0797 186000 1.9555
1.9566 28.3816 188000 1.9404
1.9604 28.6836 190000 1.9391
1.9625 28.9855 192000 1.9281
1.9518 29.2874 194000 1.9319
1.9548 29.5894 196000 1.9135
1.9602 29.8913 198000 1.9234
1.9348 30.1932 200000 1.9225
1.9382 30.4952 202000 1.9206
1.9457 30.7971 204000 1.9474
1.9209 31.0990 206000 1.9438
1.9244 31.4010 208000 1.9028
1.9217 31.7029 210000 1.9385
1.9189 32.0048 212000 1.9252
1.9173 32.3068 214000 1.9238
1.9194 32.6087 216000 1.9088
1.9161 32.9106 218000 1.8923
1.8938 33.2126 220000 1.9163
1.9032 33.5145 222000 1.8847
1.9048 33.8164 224000 1.8739
1.8848 34.1184 226000 1.9108
1.8947 34.4203 228000 1.8684
1.8972 34.7222 230000 1.9014
1.8786 35.0242 232000 1.8895
1.8851 35.3261 234000 1.8557
1.8738 35.6280 236000 1.8913
1.887 35.9300 238000 1.8585
1.8696 36.2319 240000 1.9004
1.8704 36.5338 242000 1.8849
1.8673 36.8357 244000 1.8668
1.8593 37.1377 246000 1.8919
1.8725 37.4396 248000 1.8789
1.8484 37.7415 250000 1.8609
1.8512 38.0435 252000 1.8685
1.8536 38.3454 254000 1.8561
1.8504 38.6473 256000 1.8426
1.8531 38.9493 258000 1.8693
1.8396 39.2512 260000 1.8459
1.8308 39.5531 262000 1.8567
1.8522 39.8551 264000 1.8476
1.8289 40.1570 266000 1.8348
1.8403 40.4589 268000 1.8481
1.8354 40.7609 270000 1.8343
1.8157 41.0628 272000 1.8377
1.8166 41.3647 274000 1.8699
1.8196 41.6667 276000 1.8421
1.8284 41.9686 278000 1.8476
1.8188 42.2705 280000 1.8455
1.8081 42.5725 282000 1.8256
1.8112 42.8744 284000 1.8167
1.811 43.1763 286000 1.8184
1.7994 43.4783 288000 1.8489
1.8055 43.7802 290000 1.8124
1.7945 44.0821 292000 1.8209
1.796 44.3841 294000 1.7912
1.7952 44.6860 296000 1.8288
1.7892 44.9879 298000 1.8146
1.7874 45.2899 300000 1.8286
1.7969 45.5918 302000 1.8237
1.7882 45.8937 304000 1.8113
1.7754 46.1957 306000 1.7926
1.7814 46.4976 308000 1.8164
1.7782 46.7995 310000 1.8181
1.769 47.1014 312000 1.7984
1.7656 47.4034 314000 1.8227
1.7701 47.7053 316000 1.7812
1.7584 48.0072 318000 1.8072
1.7668 48.3092 320000 1.7956
1.7701 48.6111 322000 1.7998
1.7722 48.9130 324000 1.8078
1.7496 49.2150 326000 1.8007
1.7535 49.5169 328000 1.7612
1.7655 49.8188 330000 1.7823
1.7499 50.1208 332000 1.7911
1.7487 50.4227 334000 1.7912
1.7526 50.7246 336000 1.7910
1.7461 51.0266 338000 1.8090
1.7442 51.3285 340000 1.7728
1.7403 51.6304 342000 1.7769
1.7364 51.9324 344000 1.7752
1.7339 52.2343 346000 1.7680
1.7421 52.5362 348000 1.7640
1.7283 52.8382 350000 1.7757
1.7183 53.1401 352000 1.7801
1.7297 53.4420 354000 1.7825
1.7364 53.7440 356000 1.7651
1.7246 54.0459 358000 1.7520
1.7213 54.3478 360000 1.7705
1.7335 54.6498 362000 1.7754
1.7311 54.9517 364000 1.7632
1.7195 55.2536 366000 1.7543
1.7184 55.5556 368000 1.7681
1.7111 55.8575 370000 1.7286
1.82 56.1594 372000 1.8323
1.8329 56.4614 374000 1.8618
1.856 56.7633 376000 1.8570
1.8353 57.0652 378000 1.8580
1.8463 57.3671 380000 1.8732
1.8458 57.6691 382000 1.8708
1.8534 57.9710 384000 1.8566
1.8422 58.2729 386000 1.8641
1.8364 58.5749 388000 1.8907
1.8634 58.8768 390000 1.8599
1.8396 59.1787 392000 1.8701
1.8358 59.4807 394000 1.8690
1.844 59.7826 396000 1.8382
1.8369 60.0845 398000 1.8618
1.8496 60.3865 400000 1.8499
1.8471 60.6884 402000 1.8397
1.8483 60.9903 404000 1.8343
1.832 61.2923 406000 1.8476
1.8396 61.5942 408000 1.8430
1.8359 61.8961 410000 1.8596
1.8342 62.1981 412000 1.8563
1.8343 62.5 414000 1.8413
1.8255 62.8019 416000 1.8321
1.8212 63.1039 418000 1.8531
1.827 63.4058 420000 1.8359
1.8411 63.7077 422000 1.8289
1.8143 64.0097 424000 1.8541
1.8229 64.3116 426000 1.8209
1.8297 64.6135 428000 1.8231
1.8231 64.9155 430000 1.8267
1.7933 65.2174 432000 1.8477
1.8151 65.5193 434000 1.8275
1.8289 65.8213 436000 1.8237
1.8036 66.1232 438000 1.8355
1.8154 66.4251 440000 1.8291
1.8119 66.7271 442000 1.8519
1.8042 67.0290 444000 1.8215
1.8039 67.3309 446000 1.8247
1.8102 67.6329 448000 1.8313
1.8105 67.9348 450000 1.8226
1.795 68.2367 452000 1.8268
1.8141 68.5386 454000 1.8318
1.8035 68.8406 456000 1.8010
1.7921 69.1425 458000 1.8124
1.8047 69.4444 460000 1.8202
1.8004 69.7464 462000 1.8058
1.7897 70.0483 464000 1.8264
1.7943 70.3502 466000 1.8119
1.7881 70.6522 468000 1.8335
1.7904 70.9541 470000 1.7976
1.7891 71.2560 472000 1.8160
1.7968 71.5580 474000 1.8210
1.7999 71.8599 476000 1.8129
1.782 72.1618 478000 1.8294
1.7733 72.4638 480000 1.8410
1.7757 72.7657 482000 1.8220
1.7683 73.0676 484000 1.8330
1.776 73.3696 486000 1.8045
1.7712 73.6715 488000 1.8084
1.7707 73.9734 490000 1.8097
1.7781 74.2754 492000 1.8111
1.7816 74.5773 494000 1.7739
1.7772 74.8792 496000 1.8006
1.7601 75.1812 498000 1.8229
1.7634 75.4831 500000 1.7799
1.7699 75.7850 502000 1.7981
1.7549 76.0870 504000 1.8058
1.7576 76.3889 506000 1.7838
1.7548 76.6908 508000 1.8078
1.764 76.9928 510000 1.7774
1.7574 77.2947 512000 1.7746
1.7608 77.5966 514000 1.7828
1.7528 77.8986 516000 1.8012
1.7351 78.2005 518000 1.7995
1.7674 78.5024 520000 1.7801
1.7474 78.8043 522000 1.7901
1.7411 79.1063 524000 1.7887
1.7624 79.4082 526000 1.7978
1.7331 79.7101 528000 1.7882
1.7215 80.0121 530000 1.7697
1.7385 80.3140 532000 1.7543
1.7433 80.6159 534000 1.8068
1.7441 80.9179 536000 1.7747
1.7398 81.2198 538000 1.7500
1.7451 81.5217 540000 1.7908
1.7477 81.8237 542000 1.7848
1.7315 82.1256 544000 1.7688
1.7232 82.4275 546000 1.7824
1.743 82.7295 548000 1.7811
1.7229 83.0314 550000 1.7710
1.7346 83.3333 552000 1.7652
1.7323 83.6353 554000 1.7356
1.7426 83.9372 556000 1.7642
1.7154 84.2391 558000 1.7629
1.7236 84.5411 560000 1.7470
1.7208 84.8430 562000 1.7751
1.7128 85.1449 564000 1.7509
1.7253 85.4469 566000 1.7594
1.7107 85.7488 568000 1.7609
1.7089 86.0507 570000 1.7518
1.7129 86.3527 572000 1.7357
1.7175 86.6546 574000 1.7548
1.7129 86.9565 576000 1.7454
1.7146 87.2585 578000 1.7701
1.7026 87.5604 580000 1.7506
1.7079 87.8623 582000 1.7537
1.6948 88.1643 584000 1.7668
1.708 88.4662 586000 1.7670
1.7077 88.7681 588000 1.7412
1.6943 89.0700 590000 1.7890
1.7166 89.3720 592000 1.7562
1.7003 89.6739 594000 1.7564
1.7165 89.9758 596000 1.7400
1.6949 90.2778 598000 1.7276
1.7068 90.5797 600000 1.7129
1.7035 90.8816 602000 1.7465
1.6766 91.1836 604000 1.7260
1.685 91.4855 606000 1.6977
1.6963 91.7874 608000 1.7574
1.6689 92.0894 610000 1.7326
1.6814 92.3913 612000 1.7468
1.6913 92.6932 614000 1.7330
1.6899 92.9952 616000 1.7487
1.6802 93.2971 618000 1.7206
1.6766 93.5990 620000 1.7478
1.6862 93.9010 622000 1.7332
1.6769 94.2029 624000 1.7262
1.6695 94.5048 626000 1.7498
1.6797 94.8068 628000 1.7240
1.6781 95.1087 630000 1.7668
1.6668 95.4106 632000 1.7320
1.6716 95.7126 634000 1.7078
1.6661 96.0145 636000 1.7566
1.6731 96.3164 638000 1.7435
1.6694 96.6184 640000 1.7371
1.6758 96.9203 642000 1.7184
1.6581 97.2222 644000 1.7352
1.6638 97.5242 646000 1.7119
1.65 97.8261 648000 1.7162
1.6568 98.1280 650000 1.7276
1.6536 98.4300 652000 1.7220
1.6594 98.7319 654000 1.7399
1.6437 99.0338 656000 1.7415
1.6613 99.3357 658000 1.7349
1.6599 99.6377 660000 1.7117
1.6638 99.9396 662000 1.7193
1.6526 100.2415 664000 1.7299
1.646 100.5435 666000 1.7205
1.6519 100.8454 668000 1.7275
1.6416 101.1473 670000 1.7360
1.6505 101.4493 672000 1.7357
1.6508 101.7512 674000 1.6889
1.6496 102.0531 676000 1.7161
1.6489 102.3551 678000 1.6932
1.6424 102.6570 680000 1.7244
1.6479 102.9589 682000 1.7139
1.6489 103.2609 684000 1.7185
1.6364 103.5628 686000 1.7183
1.6471 103.8647 688000 1.7009
1.6399 104.1667 690000 1.7025
1.6421 104.4686 692000 1.7204
1.6416 104.7705 694000 1.7431
1.6214 105.0725 696000 1.7183
1.6372 105.3744 698000 1.7223
1.6289 105.6763 700000 1.7158
1.6427 105.9783 702000 1.7024
1.6259 106.2802 704000 1.6860
1.6407 106.5821 706000 1.7199
1.6279 106.8841 708000 1.7305
1.623 107.1860 710000 1.7155
1.6259 107.4879 712000 1.6843
1.6322 107.7899 714000 1.7109
1.6144 108.0918 716000 1.6866
1.6017 108.3937 718000 1.7108
1.6202 108.6957 720000 1.6927
1.6154 108.9976 722000 1.7069
1.6079 109.2995 724000 1.7214
1.6223 109.6014 726000 1.7107
1.6344 109.9034 728000 1.7085
1.6121 110.2053 730000 1.6810
1.6147 110.5072 732000 1.7044
1.6138 110.8092 734000 1.6703
1.6024 111.1111 736000 1.6978
1.6165 111.4130 738000 1.7128
1.6111 111.7150 740000 1.7045
1.6086 112.0169 742000 1.6872
1.6054 112.3188 744000 1.7017
1.623 112.6208 746000 1.6943
1.6111 112.9227 748000 1.6780
1.5929 113.2246 750000 1.7081
1.597 113.5266 752000 1.6822
1.6099 113.8285 754000 1.6914
1.5902 114.1304 756000 1.6846
1.603 114.4324 758000 1.6850
1.5921 114.7343 760000 1.6839
1.5981 115.0362 762000 1.7139
1.5996 115.3382 764000 1.6749
1.5996 115.6401 766000 1.6681
1.5946 115.9420 768000 1.6875
1.5908 116.2440 770000 1.6727
1.5913 116.5459 772000 1.6889
1.5942 116.8478 774000 1.7008
1.5967 117.1498 776000 1.6730
1.581 117.4517 778000 1.6950
1.5841 117.7536 780000 1.6924
1.5804 118.0556 782000 1.6990
1.5889 118.3575 784000 1.6894
1.5895 118.6594 786000 1.6736
1.5886 118.9614 788000 1.6862
1.5783 119.2633 790000 1.6782
1.5733 119.5652 792000 1.6882
1.5902 119.8671 794000 1.6618
1.5593 120.1691 796000 1.6754
1.5798 120.4710 798000 1.6485
1.5804 120.7729 800000 1.6612
1.5711 121.0749 802000 1.6736
1.5824 121.3768 804000 1.6485
1.5708 121.6787 806000 1.7122
1.5845 121.9807 808000 1.6782
1.5726 122.2826 810000 1.6529
1.5745 122.5845 812000 1.6642
1.5737 122.8865 814000 1.6799
1.5762 123.1884 816000 1.6781
1.5624 123.4903 818000 1.6519
1.5885 123.7923 820000 1.6421
1.5638 124.0942 822000 1.6808
1.5713 124.3961 824000 1.6555
1.575 124.6981 826000 1.6784
1.5777 125.0 828000 1.6660
1.5486 125.3019 830000 1.6627
1.5699 125.6039 832000 1.6684
1.5678 125.9058 834000 1.6641
1.5626 126.2077 836000 1.6540
1.5698 126.5097 838000 1.6888
1.5657 126.8116 840000 1.6671
1.5515 127.1135 842000 1.6626
1.5615 127.4155 844000 1.6619
1.5519 127.7174 846000 1.6418
1.5512 128.0193 848000 1.6517
1.553 128.3213 850000 1.6624
1.5591 128.6232 852000 1.6240
1.5557 128.9251 854000 1.6457
1.5498 129.2271 856000 1.6582
1.551 129.5290 858000 1.7006
1.5498 129.8309 860000 1.6468
1.5472 130.1329 862000 1.6475
1.5407 130.4348 864000 1.6636
1.5582 130.7367 866000 1.6428
1.5425 131.0386 868000 1.6629
1.5503 131.3406 870000 1.6468
1.5487 131.6425 872000 1.6437
1.5462 131.9444 874000 1.6341
1.5407 132.2464 876000 1.6468
1.5483 132.5483 878000 1.6314
1.5442 132.8502 880000 1.6566
1.5357 133.1522 882000 1.6030
1.5417 133.4541 884000 1.6376
1.5551 133.7560 886000 1.6453
1.5367 134.0580 888000 1.6328
1.5405 134.3599 890000 1.6168
1.5345 134.6618 892000 1.6500
1.5297 134.9638 894000 1.6443
1.5302 135.2657 896000 1.6323
1.5368 135.5676 898000 1.6168
1.5427 135.8696 900000 1.6427
1.5356 136.1715 902000 1.6620
1.5236 136.4734 904000 1.6473
1.5464 136.7754 906000 1.6714
1.5223 137.0773 908000 1.6426
1.5393 137.3792 910000 1.6560
1.5241 137.6812 912000 1.6336
1.5318 137.9831 914000 1.6644
1.5295 138.2850 916000 1.6389
1.5172 138.5870 918000 1.5886
1.5241 138.8889 920000 1.6481
1.5231 139.1908 922000 1.6292
1.5395 139.4928 924000 1.6223
1.5262 139.7947 926000 1.6326
1.5193 140.0966 928000 1.6342
1.5232 140.3986 930000 1.6521
1.5383 140.7005 932000 1.6273
1.5235 141.0024 934000 1.6245
1.5269 141.3043 936000 1.6475
1.5161 141.6063 938000 1.6347
1.5225 141.9082 940000 1.6498
1.5154 142.2101 942000 1.6434
1.5183 142.5121 944000 1.6133
1.5158 142.8140 946000 1.6326
1.5139 143.1159 948000 1.6313
1.5244 143.4179 950000 1.6222
1.5203 143.7198 952000 1.6305
1.5125 144.0217 954000 1.6615
1.5145 144.3237 956000 1.6525
1.5107 144.6256 958000 1.6304
1.5293 144.9275 960000 1.6328
1.5032 145.2295 962000 1.6157
1.5091 145.5314 964000 1.6260
1.5107 145.8333 966000 1.6280
1.5107 146.1353 968000 1.6145
1.5098 146.4372 970000 1.6354
1.5021 146.7391 972000 1.6149
1.5077 147.0411 974000 1.6371
1.5016 147.3430 976000 1.6344
1.5064 147.6449 978000 1.6283
1.5024 147.9469 980000 1.6276
1.5021 148.2488 982000 1.6432
1.5081 148.5507 984000 1.6402
1.5001 148.8527 986000 1.6240
1.4973 149.1546 988000 1.6236
1.5085 149.4565 990000 1.5935
1.5069 149.7585 992000 1.6430
1.4976 150.0604 994000 1.6185
1.5012 150.3623 996000 1.6391
1.4988 150.6643 998000 1.6323
1.5084 150.9662 1000000 1.6158
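
The rows above are whitespace-separated in the order training loss, epoch, step, validation loss. The sketch below parses a few rows copied from the table, estimates the number of optimizer steps per epoch, and picks the row with the lowest validation loss within that excerpt. The helper name `parse_row` is hypothetical, and the examples-per-epoch figure assumes a single device with no gradient accumulation, which the card does not state.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_row(line: str) -> Row:
    """Split one results row into its four whitespace-separated fields."""
    train_loss, epoch, step, val_loss = line.split()
    return Row(float(train_loss), float(epoch), int(step), float(val_loss))


# A small excerpt copied verbatim from the results table.
excerpt = """\
1.8343 62.5 414000 1.8413
1.5295 138.2850 916000 1.6389
1.5172 138.5870 918000 1.5886
1.5241 138.8889 920000 1.6481
1.5084 150.9662 1000000 1.6158
"""

rows = [parse_row(line) for line in excerpt.splitlines()]

# Steps per epoch, estimated from each row and rounded to an integer.
steps_per_epoch = {round(r.step / r.epoch) for r in rows}
print("steps per epoch:", steps_per_epoch)

# Assumption: batch size 256 on one device, no gradient accumulation.
examples_per_epoch = 256 * min(steps_per_epoch)
print("approx. examples per epoch:", examples_per_epoch)

# Row with the lowest validation loss within this excerpt only.
best = min(rows, key=lambda r: r.val_loss)
print("best row in excerpt:", best)
```

Every row in the excerpt gives the same estimate of 6624 steps per epoch, so the 1000000 training steps correspond to roughly 151 passes over the training data.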

Framework versions

  • Transformers 4.45.1
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Model size: 110M params (Safetensors, tensor type F32)