deberta-base-en-wiki

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1310
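
Assuming the reported loss is a mean token-level cross-entropy, it corresponds to a perplexity of exp(1.1310) ≈ 3.10. The card does not state the training objective or where the checkpoint is published; the sketch below assumes a masked-language-modeling checkpoint loadable under the placeholder repo id `deberta-base-en-wiki`:

```python
import math

from transformers import pipeline

# Perplexity implied by the reported evaluation loss, assuming it is a
# mean token-level cross-entropy.
print(math.exp(1.1310))  # ~3.10

# "deberta-base-en-wiki" is a placeholder repo id; substitute the actual
# Hub path or local directory of this checkpoint.
fill = pipeline("fill-mask", model="deberta-base-en-wiki")
print(fill("The capital of France is [MASK].")[0]["token_str"])
```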

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1250
  • num_epochs: 10
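
These settings map directly onto `transformers.TrainingArguments`. The sketch below reconstructs that configuration under stated assumptions: `output_dir` is a placeholder, the data and model setup are not shown, and the effective train batch size follows from 16 per device × 2 GPUs × 8 accumulation steps = 256, matching the card.

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters listed above; not the original
# training script. output_dir is a placeholder.
args = TrainingArguments(
    output_dir="deberta-base-en-wiki",
    learning_rate=2e-4,
    per_device_train_batch_size=16,  # x 2 GPUs x 8 accumulation = 256 effective
    per_device_eval_batch_size=8,    # x 2 GPUs = 16 effective
    gradient_accumulation_steps=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=1250,
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```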

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.4739 | 0.0504 | 1250 | 6.4351 |
| 3.4566 | 0.1009 | 2500 | 3.2703 |
| 2.7823 | 0.1513 | 3750 | 2.6563 |
| 2.5242 | 0.2018 | 5000 | 2.4202 |
| 2.3816 | 0.2522 | 6250 | 2.2763 |
| 2.2723 | 0.3027 | 7500 | 2.1715 |
| 2.182 | 0.3531 | 8750 | 2.0933 |
| 2.1325 | 0.4035 | 10000 | 2.0317 |
| 2.0584 | 0.4540 | 11250 | 1.9770 |
| 2.0333 | 0.5044 | 12500 | 1.9288 |
| 1.9898 | 0.5549 | 13750 | 1.8972 |
| 1.9574 | 0.6053 | 15000 | 1.8557 |
| 1.9184 | 0.6558 | 16250 | 1.8324 |
| 1.8899 | 0.7062 | 17500 | 1.8049 |
| 1.8909 | 0.7567 | 18750 | 1.7788 |
| 1.8371 | 0.8071 | 20000 | 1.7558 |
| 1.8343 | 0.8575 | 21250 | 1.7374 |
| 1.8341 | 0.9080 | 22500 | 1.7256 |
| 1.7976 | 0.9584 | 23750 | 1.7011 |
| 1.7777 | 1.0089 | 25000 | 1.6865 |
| 1.7523 | 1.0593 | 26250 | 1.6715 |
| 1.7476 | 1.1098 | 27500 | 1.6581 |
| 1.7291 | 1.1602 | 28750 | 1.6432 |
| 1.7108 | 1.2106 | 30000 | 1.6333 |
| 1.7195 | 1.2611 | 31250 | 1.6198 |
| 1.6969 | 1.3115 | 32500 | 1.6109 |
| 1.6927 | 1.3620 | 33750 | 1.5965 |
| 1.6818 | 1.4124 | 35000 | 1.5917 |
| 1.6647 | 1.4629 | 36250 | 1.5827 |
| 1.6635 | 1.5133 | 37500 | 1.5704 |
| 1.6561 | 1.5637 | 38750 | 1.5593 |
| 1.6404 | 1.6142 | 40000 | 1.5527 |
| 1.627 | 1.6646 | 41250 | 1.5470 |
| 1.6292 | 1.7151 | 42500 | 1.5391 |
| 1.6111 | 1.7655 | 43750 | 1.5288 |
| 1.6154 | 1.8160 | 45000 | 1.5217 |
| 1.5993 | 1.8664 | 46250 | 1.5191 |
| 1.6028 | 1.9168 | 47500 | 1.5077 |
| 1.5861 | 1.9673 | 48750 | 1.5019 |
| 1.5793 | 2.0177 | 50000 | 1.4954 |
| 1.5664 | 2.0682 | 51250 | 1.4887 |
| 1.5723 | 2.1186 | 52500 | 1.4839 |
| 1.5715 | 2.1691 | 53750 | 1.4786 |
| 1.5612 | 2.2195 | 55000 | 1.4757 |
| 1.5499 | 2.2700 | 56250 | 1.4648 |
| 1.5542 | 2.3204 | 57500 | 1.4632 |
| 1.5531 | 2.3708 | 58750 | 1.4558 |
| 1.5329 | 2.4213 | 60000 | 1.4507 |
| 1.5481 | 2.4717 | 61250 | 1.4472 |
| 1.5336 | 2.5222 | 62500 | 1.4431 |
| 1.526 | 2.5726 | 63750 | 1.4405 |
| 1.518 | 2.6231 | 65000 | 1.4345 |
| 1.5135 | 2.6735 | 66250 | 1.4264 |
| 1.4987 | 2.7239 | 67500 | 1.4226 |
| 1.5007 | 2.7744 | 68750 | 1.4176 |
| 1.4921 | 2.8248 | 70000 | 1.4179 |
| 1.5031 | 2.8753 | 71250 | 1.4146 |
| 1.4848 | 2.9257 | 72500 | 1.4098 |
| 1.4702 | 2.9762 | 73750 | 1.4023 |
| 1.4861 | 3.0266 | 75000 | 1.4010 |
| 1.487 | 3.0770 | 76250 | 1.3963 |
| 1.4736 | 3.1275 | 77500 | 1.3923 |
| 1.4751 | 3.1779 | 78750 | 1.3879 |
| 1.4783 | 3.2284 | 80000 | 1.3858 |
| 1.4843 | 3.2788 | 81250 | 1.3795 |
| 1.4722 | 3.3293 | 82500 | 1.3771 |
| 1.4551 | 3.3797 | 83750 | 1.3754 |
| 1.4539 | 3.4302 | 85000 | 1.3729 |
| 1.4723 | 3.4806 | 86250 | 1.3646 |
| 1.4493 | 3.5310 | 87500 | 1.3658 |
| 1.4455 | 3.5815 | 88750 | 1.3610 |
| 1.4442 | 3.6319 | 90000 | 1.3573 |
| 1.4457 | 3.6824 | 91250 | 1.3540 |
| 1.4259 | 3.7328 | 92500 | 1.3534 |
| 1.4355 | 3.7833 | 93750 | 1.3470 |
| 1.4184 | 3.8337 | 95000 | 1.3435 |
| 1.4437 | 3.8841 | 96250 | 1.3416 |
| 1.4255 | 3.9346 | 97500 | 1.3377 |
| 1.4115 | 3.9850 | 98750 | 1.3358 |
| 1.4196 | 4.0355 | 100000 | 1.3351 |
| 1.4159 | 4.0859 | 101250 | 1.3292 |
| 1.4227 | 4.1364 | 102500 | 1.3302 |
| 1.4122 | 4.1868 | 103750 | 1.3270 |
| 1.3996 | 4.2372 | 105000 | 1.3207 |
| 1.4041 | 4.2877 | 106250 | 1.3210 |
| 1.3956 | 4.3381 | 107500 | 1.3187 |
| 1.392 | 4.3886 | 108750 | 1.3170 |
| 1.3943 | 4.4390 | 110000 | 1.3125 |
| 1.4143 | 4.4895 | 111250 | 1.3095 |
| 1.3939 | 4.5399 | 112500 | 1.3063 |
| 1.3802 | 4.5903 | 113750 | 1.3067 |
| 1.3908 | 4.6408 | 115000 | 1.3020 |
| 1.3841 | 4.6912 | 116250 | 1.3025 |
| 1.3821 | 4.7417 | 117500 | 1.3007 |
| 1.3774 | 4.7921 | 118750 | 1.2989 |
| 1.3807 | 4.8426 | 120000 | 1.2907 |
| 1.3643 | 4.8930 | 121250 | 1.2946 |
| 1.3704 | 4.9435 | 122500 | 1.2920 |
| 1.3685 | 4.9939 | 123750 | 1.2868 |
| 1.3794 | 5.0443 | 125000 | 1.2812 |
| 1.3646 | 5.0948 | 126250 | 1.2809 |
| 1.356 | 5.1452 | 127500 | 1.2803 |
| 1.3696 | 5.1957 | 128750 | 1.2784 |
| 1.3544 | 5.2461 | 130000 | 1.2741 |
| 1.3618 | 5.2966 | 131250 | 1.2736 |
| 1.3471 | 5.3470 | 132500 | 1.2695 |
| 1.3444 | 5.3974 | 133750 | 1.2648 |
| 1.3524 | 5.4479 | 135000 | 1.2658 |
| 1.354 | 5.4983 | 136250 | 1.2643 |
| 1.3438 | 5.5488 | 137500 | 1.2639 |
| 1.357 | 5.5992 | 138750 | 1.2599 |
| 1.3473 | 5.6497 | 140000 | 1.2617 |
| 1.3309 | 5.7001 | 141250 | 1.2568 |
| 1.3328 | 5.7505 | 142500 | 1.2511 |
| 1.3236 | 5.8010 | 143750 | 1.2511 |
| 1.3276 | 5.8514 | 145000 | 1.2507 |
| 1.3288 | 5.9019 | 146250 | 1.2466 |
| 1.3238 | 5.9523 | 147500 | 1.2456 |
| 1.3327 | 6.0028 | 148750 | 1.2484 |
| 1.3329 | 6.0532 | 150000 | 1.2424 |
| 1.3328 | 6.1037 | 151250 | 1.2361 |
| 1.307 | 6.1541 | 152500 | 1.2407 |
| 1.3285 | 6.2045 | 153750 | 1.2374 |
| 1.3097 | 6.2550 | 155000 | 1.2339 |
| 1.3115 | 6.3054 | 156250 | 1.2354 |
| 1.304 | 6.3559 | 157500 | 1.2294 |
| 1.3132 | 6.4063 | 158750 | 1.2290 |
| 1.303 | 6.4568 | 160000 | 1.2276 |
| 1.3029 | 6.5072 | 161250 | 1.2270 |
| 1.3048 | 6.5576 | 162500 | 1.2229 |
| 1.3085 | 6.6081 | 163750 | 1.2226 |
| 1.2887 | 6.6585 | 165000 | 1.2209 |
| 1.3055 | 6.7090 | 166250 | 1.2206 |
| 1.2902 | 6.7594 | 167500 | 1.2178 |
| 1.2892 | 6.8099 | 168750 | 1.2149 |
| 1.3049 | 6.8603 | 170000 | 1.2125 |
| 1.2935 | 6.9107 | 171250 | 1.2115 |
| 1.2888 | 6.9612 | 172500 | 1.2091 |
| 1.2856 | 7.0116 | 173750 | 1.2082 |
| 1.2762 | 7.0621 | 175000 | 1.2085 |
| 1.2883 | 7.1125 | 176250 | 1.2055 |
| 1.2906 | 7.1630 | 177500 | 1.2019 |
| 1.2831 | 7.2134 | 178750 | 1.2047 |
| 1.2654 | 7.2638 | 180000 | 1.1995 |
| 1.2759 | 7.3143 | 181250 | 1.1994 |
| 1.276 | 7.3647 | 182500 | 1.1992 |
| 1.2692 | 7.4152 | 183750 | 1.1974 |
| 1.2791 | 7.4656 | 185000 | 1.1940 |
| 1.2697 | 7.5161 | 186250 | 1.1930 |
| 1.2635 | 7.5665 | 187500 | 1.1889 |
| 1.2656 | 7.6170 | 188750 | 1.1926 |
| 1.2615 | 7.6675 | 190000 | 1.1828 |
| 1.2641 | 7.7179 | 191250 | 1.1852 |
| 1.2578 | 7.7684 | 192500 | 1.1791 |
| 1.2647 | 7.8188 | 193750 | 1.1782 |
| 1.2644 | 7.8692 | 195000 | 1.1777 |
| 1.2638 | 7.9197 | 196250 | 1.1752 |
| 1.2528 | 7.9701 | 197500 | 1.1748 |
| 1.2554 | 8.0206 | 198750 | 1.1746 |
| 1.2548 | 8.0710 | 200000 | 1.1726 |
| 1.2546 | 8.1215 | 201250 | 1.1698 |
| 1.247 | 8.1719 | 202500 | 1.1689 |
| 1.2478 | 8.2223 | 203750 | 1.1698 |
| 1.2578 | 8.2728 | 205000 | 1.1650 |
| 1.2527 | 8.3232 | 206250 | 1.1650 |
| 1.2612 | 8.3737 | 207500 | 1.1639 |
| 1.2339 | 8.4241 | 208750 | 1.1635 |
| 1.2422 | 8.4746 | 210000 | 1.1633 |
| 1.2311 | 8.5250 | 211250 | 1.1617 |
| 1.2552 | 8.5754 | 212500 | 1.1585 |
| 1.2383 | 8.6259 | 213750 | 1.1561 |
| 1.2406 | 8.6763 | 215000 | 1.1555 |
| 1.2329 | 8.7268 | 216250 | 1.1551 |
| 1.2392 | 8.7772 | 217500 | 1.1552 |
| 1.2301 | 8.8277 | 218750 | 1.1536 |
| 1.2262 | 8.8781 | 220000 | 1.1483 |
| 1.2284 | 8.9286 | 221250 | 1.1509 |
| 1.2259 | 8.9790 | 222500 | 1.1529 |
| 1.2204 | 9.0294 | 223750 | 1.1474 |
| 1.237 | 9.0799 | 225000 | 1.1471 |
| 1.2432 | 9.1303 | 226250 | 1.1439 |
| 1.2145 | 9.1808 | 227500 | 1.1473 |
| 1.2132 | 9.2312 | 228750 | 1.1428 |
| 1.2178 | 9.2817 | 230000 | 1.1426 |
| 1.2138 | 9.3321 | 231250 | 1.1416 |
| 1.2204 | 9.3825 | 232500 | 1.1422 |
| 1.2233 | 9.4330 | 233750 | 1.1402 |
| 1.2048 | 9.4834 | 235000 | 1.1370 |
| 1.2203 | 9.5339 | 236250 | 1.1389 |
| 1.2156 | 9.5843 | 237500 | 1.1375 |
| 1.2131 | 9.6348 | 238750 | 1.1367 |
| 1.2215 | 9.6852 | 240000 | 1.1387 |
| 1.2152 | 9.7356 | 241250 | 1.1347 |
| 1.2179 | 9.7861 | 242500 | 1.1321 |
| 1.2166 | 9.8365 | 243750 | 1.1359 |
| 1.2171 | 9.8870 | 245000 | 1.1343 |
| 1.208 | 9.9374 | 246250 | 1.1321 |
| 1.2105 | 9.9879 | 247500 | 1.1332 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
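
To reproduce this environment, the versions above can be pinned at install time. The sketch below only verifies them at runtime; the pip commands in the comment are an assumption about how the stack was installed (the cu121 wheel index is the standard PyTorch one):

```python
# Suggested pins (run via pip):
#   pip install transformers==4.41.2 datasets==2.20.0 tokenizers==0.19.1
#   pip install torch==2.3.1 --index-url https://download.pytorch.org/whl/cu121
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.41.2"
assert datasets.__version__ == "2.20.0"
assert tokenizers.__version__ == "0.19.1"
assert torch.__version__.startswith("2.3.1")  # card reports 2.3.1+cu121
```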