dit_base_binary_task

This model is a fine-tuned version of microsoft/dit-base on the davanstrien/leicester_loaded_annotations_binary dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0513
  • Accuracy: 0.9873
  • F1: 0.9600

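As a usage sketch (not part of the original card), the checkpoint can be loaded with the standard transformers image-classification pipeline. The Hub id davanstrien/dit_base_binary_task below is an assumption inferred from the card name and dataset owner and may differ from the actual repository.

```python
from transformers import pipeline
from PIL import Image

# Assumed Hub id based on the card name; adjust if the repository lives elsewhere.
classifier = pipeline("image-classification", model="davanstrien/dit_base_binary_task")

image = Image.open("page.png")  # hypothetical document-page image
print(classifier(image))        # e.g. [{'label': ..., 'score': ...}, ...]
```
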
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 50

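The values above map onto transformers TrainingArguments roughly as in the sketch below. The output_dir name is an assumption taken from the card title, and the Adam settings listed above correspond to the Trainer's default optimizer configuration.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="dit_base_binary_task",   # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=16,      # train_batch_size
    per_device_eval_batch_size=16,       # eval_batch_size
    seed=42,
    gradient_accumulation_steps=4,       # total_train_batch_size = 16 * 4 = 64
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer defaults.
)
```
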
Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| No log        | 0.87  | 5    | 0.6816          | 0.5      | 0.2476 |
| 0.7387        | 1.87  | 10   | 0.5142          | 0.8354   | 0.0    |
| 0.7387        | 2.87  | 15   | 0.4690          | 0.8354   | 0.0    |
| 0.4219        | 3.87  | 20   | 0.5460          | 0.8354   | 0.0    |
| 0.4219        | 4.87  | 25   | 0.4703          | 0.8354   | 0.0    |
| 0.3734        | 5.87  | 30   | 0.4371          | 0.8354   | 0.0    |
| 0.3734        | 6.87  | 35   | 0.4147          | 0.8354   | 0.0    |
| 0.3261        | 7.87  | 40   | 0.4272          | 0.8354   | 0.0    |
| 0.3261        | 8.87  | 45   | 0.4038          | 0.8354   | 0.0    |
| 0.3078        | 9.87  | 50   | 0.3418          | 0.8354   | 0.0    |
| 0.3078        | 10.87 | 55   | 0.3042          | 0.8354   | 0.0    |
| 0.2501        | 11.87 | 60   | 0.2799          | 0.8354   | 0.0    |
| 0.2501        | 12.87 | 65   | 0.1419          | 0.9367   | 0.7619 |
| 0.1987        | 13.87 | 70   | 0.1224          | 0.9494   | 0.8182 |
| 0.1987        | 14.87 | 75   | 0.0749          | 0.9747   | 0.9167 |
| 0.1391        | 15.87 | 80   | 0.0539          | 0.9810   | 0.9412 |
| 0.1391        | 16.87 | 85   | 0.0830          | 0.9873   | 0.9600 |
| 0.1085        | 17.87 | 90   | 0.0443          | 0.9873   | 0.9600 |
| 0.1085        | 18.87 | 95   | 0.0258          | 0.9937   | 0.9804 |
| 0.1039        | 19.87 | 100  | 0.1025          | 0.9684   | 0.8936 |
| 0.1039        | 20.87 | 105  | 0.1597          | 0.9684   | 0.8936 |
| 0.1217        | 21.87 | 110  | 0.0278          | 0.9937   | 0.9811 |
| 0.1217        | 22.87 | 115  | 0.0458          | 0.9873   | 0.9600 |
| 0.0609        | 23.87 | 120  | 0.0478          | 0.9937   | 0.9804 |
| 0.0609        | 24.87 | 125  | 0.0671          | 0.9747   | 0.9231 |
| 0.1031        | 25.87 | 130  | 0.0751          | 0.9873   | 0.9600 |
| 0.1031        | 26.87 | 135  | 0.1963          | 0.9557   | 0.8444 |
| 0.0601        | 27.87 | 140  | 0.0870          | 0.9747   | 0.9167 |
| 0.0601        | 28.87 | 145  | 0.0890          | 0.9747   | 0.9167 |
| 0.0799        | 29.87 | 150  | 0.1017          | 0.9747   | 0.9167 |
| 0.0799        | 30.87 | 155  | 0.0041          | 1.0      | 1.0    |
| 0.0441        | 31.87 | 160  | 0.0332          | 0.9873   | 0.9615 |
| 0.0441        | 32.87 | 165  | 0.0839          | 0.9747   | 0.9167 |
| 0.0757        | 33.87 | 170  | 0.0722          | 0.9873   | 0.9600 |
| 0.0757        | 34.87 | 175  | 0.0168          | 0.9937   | 0.9804 |
| 0.0555        | 35.87 | 180  | 0.0443          | 0.9937   | 0.9804 |
| 0.0555        | 36.87 | 185  | 0.0227          | 0.9873   | 0.9615 |
| 0.0336        | 37.87 | 190  | 0.0128          | 0.9937   | 0.9804 |
| 0.0336        | 38.87 | 195  | 0.0169          | 0.9937   | 0.9811 |
| 0.0405        | 39.87 | 200  | 0.0193          | 0.9937   | 0.9804 |
| 0.0405        | 40.87 | 205  | 0.1216          | 0.9810   | 0.9388 |
| 0.0578        | 41.87 | 210  | 0.0307          | 0.9937   | 0.9804 |
| 0.0578        | 42.87 | 215  | 0.0539          | 0.9873   | 0.9600 |
| 0.0338        | 43.87 | 220  | 0.0573          | 0.9937   | 0.9804 |
| 0.0338        | 44.87 | 225  | 0.0086          | 1.0      | 1.0    |
| 0.0417        | 45.87 | 230  | 0.0491          | 0.9873   | 0.9600 |
| 0.0417        | 46.87 | 235  | 0.0089          | 1.0      | 1.0    |
| 0.0538        | 47.87 | 240  | 0.0846          | 0.9810   | 0.9388 |
| 0.0538        | 48.87 | 245  | 0.0452          | 0.9810   | 0.9388 |
| 0.0364        | 49.87 | 250  | 0.0513          | 0.9873   | 0.9600 |

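The Accuracy and F1 columns above are the kind of values produced by a compute_metrics callback passed to the Trainer. The card does not show the actual callback, so the following is only a minimal sketch of how such metrics could be computed, here using the evaluate library (an assumption; it is not listed among the framework versions).

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair supplied by the Trainer at evaluation time.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=predictions, references=labels)["accuracy"],
        "f1": f1.compute(predictions=predictions, references=labels)["f1"],
    }
```
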
Framework versions

  • Transformers 4.25.1
  • Pytorch 1.12.1
  • Datasets 2.7.1
  • Tokenizers 0.13.1