deberta_large

This model is a fine-tuned version of microsoft/deberta-v3-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6965
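
The card does not document the task or head this checkpoint was fine-tuned for, so the snippet below is only a minimal loading sketch: it pulls the weights with the generic AutoModel class, which loads the encoder but discards any task-specific head saved alongside it.

```python
from transformers import AutoTokenizer, AutoModel

# Repo id is taken from this card; AutoModel is an assumption, since the
# fine-tuning task (classification, regression, ...) is not stated.
tokenizer = AutoTokenizer.from_pretrained("jhonalevc1995/deberta_large")
model = AutoModel.from_pretrained("jhonalevc1995/deberta_large")

inputs = tokenizer("Hello, DeBERTa!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 1024) for deberta-v3-large
```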

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 1
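
These values map onto transformers.TrainingArguments roughly as shown below. The output_dir, optim name, and evaluation cadence are assumptions, since the card only lists the raw values; note that the effective batch size is 4 × 2 accumulation steps = 8, matching total_train_batch_size above.

```python
from transformers import TrainingArguments

# Sketch reconstructing the card's hyperparameters. output_dir,
# optim="adamw_torch", and the eval cadence are assumptions; the
# remaining values are copied from the list above.
args = TrainingArguments(
    output_dir="deberta_large",      # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size: 8
    seed=3407,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are its defaults
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=1,
    eval_strategy="steps",           # assumption: evaluate every 20 steps,
    eval_steps=20,                   # matching the results table below
)
```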

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.471         | 0.0173 | 20   | 0.2712          |
| 1.4977        | 0.0346 | 40   | 0.7702          |
| 0.7066        | 0.0519 | 60   | 0.4904          |
| 0.6817        | 0.0692 | 80   | 1.0408          |
| 0.897         | 0.0865 | 100  | 1.7531          |
| 0.6793        | 0.1039 | 120  | 0.8040          |
| 0.7946        | 0.1212 | 140  | 0.6972          |
| 0.7391        | 0.1385 | 160  | 0.6933          |
| 0.7441        | 0.1558 | 180  | 0.7431          |
| 0.656         | 0.1731 | 200  | 0.8636          |
| 0.5681        | 0.1904 | 220  | 0.7142          |
| 0.9031        | 0.2077 | 240  | 0.7183          |
| 0.5978        | 0.2250 | 260  | 0.7371          |
| 0.6308        | 0.2423 | 280  | 0.7036          |
| 0.6829        | 0.2596 | 300  | 0.6932          |
| 0.6472        | 0.2769 | 320  | 0.7010          |
| 0.6331        | 0.2942 | 340  | 0.7316          |
| 0.7698        | 0.3116 | 360  | 0.6938          |
| 0.6704        | 0.3289 | 380  | 0.7128          |
| 0.7532        | 0.3462 | 400  | 0.6992          |
| 0.703         | 0.3635 | 420  | 0.6932          |
| 0.7747        | 0.3808 | 440  | 0.7437          |
| 0.7063        | 0.3981 | 460  | 0.6931          |
| 0.6013        | 0.4154 | 480  | 0.7127          |
| 0.768         | 0.4327 | 500  | 0.6955          |
| 0.682         | 0.4500 | 520  | 0.6948          |
| 0.7419        | 0.4673 | 540  | 0.6976          |
| 0.6928        | 0.4846 | 560  | 0.6963          |
| 0.6246        | 0.5019 | 580  | 0.6995          |
| 0.6742        | 0.5193 | 600  | 0.6965          |
| 0.6419        | 0.5366 | 620  | 0.6947          |
| 0.7268        | 0.5539 | 640  | 0.6996          |
| 0.7054        | 0.5712 | 660  | 0.6938          |
| 0.6931        | 0.5885 | 680  | 0.7010          |
| 0.7344        | 0.6058 | 700  | 0.6975          |
| 0.7151        | 0.6231 | 720  | 0.6950          |
| 0.7106        | 0.6404 | 740  | 0.7027          |
| 0.6829        | 0.6577 | 760  | 0.6941          |
| 0.6883        | 0.6750 | 780  | 0.7005          |
| 0.6268        | 0.6923 | 800  | 0.6978          |
| 0.6674        | 0.7096 | 820  | 0.6979          |
| 0.6704        | 0.7270 | 840  | 0.7024          |
| 0.7527        | 0.7443 | 860  | 0.7010          |
| 0.7239        | 0.7616 | 880  | 0.7060          |
| 0.7611        | 0.7789 | 900  | 0.7033          |
| 0.7522        | 0.7962 | 920  | 0.7010          |
| 0.773         | 0.8135 | 940  | 0.6992          |
| 0.6564        | 0.8308 | 960  | 0.6959          |
| 0.7369        | 0.8481 | 980  | 0.6970          |
| 0.7119        | 0.8654 | 1000 | 0.6971          |
| 0.7181        | 0.8827 | 1020 | 0.6971          |
| 0.7011        | 0.9000 | 1040 | 0.6975          |
| 0.7414        | 0.9174 | 1060 | 0.6968          |
| 0.7732        | 0.9347 | 1080 | 0.6968          |
| 0.6499        | 0.9520 | 1100 | 0.6965          |
| 0.6681        | 0.9693 | 1120 | 0.6965          |
| 0.7831        | 0.9866 | 1140 | 0.6965          |

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1