dit-base_tobacco_small_student

This model is a fine-tuned version of microsoft/dit-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 4.3305
  • Accuracy: 0.435
  • Brier Loss: 1.0472
  • NLL (negative log-likelihood): 10.3327
  • F1 Micro: 0.435
  • F1 Macro: 0.4299
  • ECE (expected calibration error): 0.5115
  • AURC (area under the risk-coverage curve): 0.4245

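The calibration-related metrics above (Brier loss, NLL, ECE, AURC) can be reproduced from softmax probabilities roughly as in the sketch below. This is an illustrative NumPy implementation assuming the standard definitions of these metrics, not the exact evaluation code used for this model.

```python
import numpy as np

def calibration_metrics(probs, labels, n_bins=10):
    """Illustrative metric computation.

    probs:  (N, C) softmax probabilities
    labels: (N,) integer class labels
    """
    n, c = probs.shape
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)

    accuracy = correct.mean()

    # Brier loss: mean squared error against one-hot targets.
    onehot = np.eye(c)[labels]
    brier = np.mean(np.sum((probs - onehot) ** 2, axis=1))

    # NLL: negative log-likelihood of the true class.
    nll = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))

    # ECE: expected calibration error over equal-width confidence bins.
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())

    # AURC: area under the risk-coverage curve, where risk is the error rate
    # among the most confident fraction of predictions at each coverage level.
    order = np.argsort(-confidences)
    risks = np.cumsum(1.0 - correct[order]) / np.arange(1, n + 1)
    aurc = risks.mean()

    return {"accuracy": accuracy, "brier": brier, "nll": nll, "ece": ece, "aurc": aurc}
```
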
Model description

More information needed

Intended uses & limitations

More information needed
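
Pending documented uses, a minimal inference sketch is shown below. It assumes this is a document-image classification checkpoint (given the DiT backbone) published on the Hugging Face Hub; the repo id and the image path are placeholders to replace with the actual values.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder repo id; replace with the actual Hub path of this checkpoint.
model_id = "dit-base_tobacco_small_student"

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

# A scanned document page, converted to RGB as the image processor expects.
image = Image.open("document.png").convert("RGB")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = model.config.id2label[logits.argmax(-1).item()]
print(predicted_class)
```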

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 100

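These values map onto the Hugging Face Trainer API roughly as follows (a sketch assuming the standard TrainingArguments; model, dataset, and Trainer setup are omitted, and output_dir is a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dit-base_tobacco_small_student",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=100,
    # Adam betas and epsilon below match the listed values, which are also
    # the Trainer defaults.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```
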
Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 1.0 | 50 | 2.1780 | 0.16 | 0.8745 | 11.2696 | 0.16 | 0.0323 | 0.2326 | 0.8208 |
| No log | 2.0 | 100 | 2.1761 | 0.19 | 0.8727 | 10.5065 | 0.19 | 0.0548 | 0.2712 | 0.7980 |
| No log | 3.0 | 150 | 2.1426 | 0.16 | 0.8689 | 8.8915 | 0.16 | 0.0451 | 0.2697 | 0.6322 |
| No log | 4.0 | 200 | 2.0668 | 0.225 | 0.8434 | 9.6036 | 0.225 | 0.1219 | 0.2680 | 0.6623 |
| No log | 5.0 | 250 | 2.0633 | 0.21 | 0.8447 | 5.7679 | 0.2100 | 0.1401 | 0.2733 | 0.5765 |
| No log | 6.0 | 300 | 2.0030 | 0.22 | 0.8351 | 7.1501 | 0.22 | 0.1132 | 0.3000 | 0.6750 |
| No log | 7.0 | 350 | 1.9273 | 0.32 | 0.8243 | 6.2911 | 0.32 | 0.2612 | 0.2822 | 0.6549 |
| No log | 8.0 | 400 | 1.7954 | 0.365 | 0.7742 | 4.2641 | 0.3650 | 0.2647 | 0.2630 | 0.5031 |
| No log | 9.0 | 450 | 1.8070 | 0.36 | 0.7720 | 4.9274 | 0.36 | 0.2914 | 0.2601 | 0.4871 |
| 1.9795 | 10.0 | 500 | 1.7838 | 0.34 | 0.7857 | 3.3860 | 0.34 | 0.2387 | 0.2902 | 0.5057 |
| 1.9795 | 11.0 | 550 | 1.7214 | 0.395 | 0.7404 | 4.1630 | 0.395 | 0.2995 | 0.2922 | 0.4210 |
| 1.9795 | 12.0 | 600 | 1.6834 | 0.445 | 0.7284 | 3.7081 | 0.445 | 0.3444 | 0.2700 | 0.3914 |
| 1.9795 | 13.0 | 650 | 1.6992 | 0.38 | 0.7641 | 4.1246 | 0.38 | 0.3045 | 0.3375 | 0.4155 |
| 1.9795 | 14.0 | 700 | 1.8695 | 0.395 | 0.7711 | 5.6899 | 0.395 | 0.3432 | 0.3224 | 0.4425 |
| 1.9795 | 15.0 | 750 | 1.8757 | 0.38 | 0.7939 | 5.1099 | 0.38 | 0.3879 | 0.3102 | 0.4313 |
| 1.9795 | 16.0 | 800 | 2.0457 | 0.405 | 0.8184 | 5.6034 | 0.405 | 0.3957 | 0.3256 | 0.4414 |
| 1.9795 | 17.0 | 850 | 2.2243 | 0.395 | 0.8653 | 7.7124 | 0.395 | 0.3567 | 0.3887 | 0.3997 |
| 1.9795 | 18.0 | 900 | 1.9309 | 0.45 | 0.7794 | 5.2698 | 0.45 | 0.3763 | 0.3626 | 0.3767 |
| 1.9795 | 19.0 | 950 | 2.2285 | 0.415 | 0.8319 | 6.7127 | 0.415 | 0.4153 | 0.3667 | 0.3942 |
| 0.6717 | 20.0 | 1000 | 2.3745 | 0.445 | 0.8643 | 7.4432 | 0.445 | 0.4290 | 0.3859 | 0.4046 |
| 0.6717 | 21.0 | 1050 | 2.5389 | 0.41 | 0.9148 | 7.6865 | 0.41 | 0.3994 | 0.4351 | 0.4054 |
| 0.6717 | 22.0 | 1100 | 2.5537 | 0.465 | 0.8500 | 8.1266 | 0.465 | 0.4623 | 0.4070 | 0.3900 |
| 0.6717 | 23.0 | 1150 | 2.8355 | 0.42 | 0.9426 | 8.8542 | 0.4200 | 0.3930 | 0.4508 | 0.4201 |
| 0.6717 | 24.0 | 1200 | 2.8575 | 0.4 | 0.9962 | 7.6428 | 0.4000 | 0.3502 | 0.4994 | 0.4119 |
| 0.6717 | 25.0 | 1250 | 2.8704 | 0.445 | 0.9418 | 9.2600 | 0.445 | 0.4570 | 0.4309 | 0.4021 |
| 0.6717 | 26.0 | 1300 | 3.4702 | 0.43 | 0.9641 | 12.1621 | 0.4300 | 0.3977 | 0.4590 | 0.3597 |
| 0.6717 | 27.0 | 1350 | 3.1484 | 0.475 | 0.9518 | 8.1474 | 0.4750 | 0.4641 | 0.4809 | 0.4088 |
| 0.6717 | 28.0 | 1400 | 3.2299 | 0.455 | 0.9673 | 9.6161 | 0.455 | 0.4205 | 0.4711 | 0.3806 |
| 0.6717 | 29.0 | 1450 | 3.4968 | 0.425 | 1.0136 | 10.5614 | 0.425 | 0.3992 | 0.4743 | 0.3773 |
| 0.0268 | 30.0 | 1500 | 3.1340 | 0.46 | 0.9443 | 8.5023 | 0.46 | 0.4296 | 0.4557 | 0.3735 |
| 0.0268 | 31.0 | 1550 | 3.4297 | 0.435 | 1.0058 | 8.2428 | 0.435 | 0.3979 | 0.4967 | 0.3848 |
| 0.0268 | 32.0 | 1600 | 3.6922 | 0.4 | 1.0488 | 10.8019 | 0.4000 | 0.3880 | 0.5223 | 0.4017 |
| 0.0268 | 33.0 | 1650 | 3.6009 | 0.445 | 0.9964 | 10.1007 | 0.445 | 0.4204 | 0.4924 | 0.3981 |
| 0.0268 | 34.0 | 1700 | 3.6678 | 0.425 | 1.0494 | 9.1369 | 0.425 | 0.3896 | 0.5159 | 0.4192 |
| 0.0268 | 35.0 | 1750 | 3.5743 | 0.45 | 0.9953 | 9.5996 | 0.45 | 0.4182 | 0.4934 | 0.4030 |
| 0.0268 | 36.0 | 1800 | 3.5551 | 0.465 | 0.9877 | 9.6080 | 0.465 | 0.4221 | 0.5033 | 0.3977 |
| 0.0268 | 37.0 | 1850 | 3.7424 | 0.435 | 1.0191 | 9.9258 | 0.435 | 0.4292 | 0.4955 | 0.4120 |
| 0.0268 | 38.0 | 1900 | 3.7093 | 0.45 | 1.0051 | 9.7038 | 0.45 | 0.4033 | 0.4966 | 0.3857 |
| 0.0268 | 39.0 | 1950 | 3.7240 | 0.45 | 1.0076 | 9.8462 | 0.45 | 0.4027 | 0.4953 | 0.3962 |
| 0.0022 | 40.0 | 2000 | 3.7503 | 0.455 | 1.0090 | 9.9967 | 0.455 | 0.4076 | 0.5056 | 0.3968 |
| 0.0022 | 41.0 | 2050 | 3.5545 | 0.44 | 1.0007 | 8.7616 | 0.44 | 0.4285 | 0.4894 | 0.4008 |
| 0.0022 | 42.0 | 2100 | 3.7452 | 0.435 | 1.0142 | 9.4376 | 0.435 | 0.4135 | 0.5032 | 0.4022 |
| 0.0022 | 43.0 | 2150 | 3.5980 | 0.47 | 0.9532 | 8.2333 | 0.47 | 0.4441 | 0.4650 | 0.4113 |
| 0.0022 | 44.0 | 2200 | 3.7055 | 0.45 | 0.9946 | 9.0121 | 0.45 | 0.4327 | 0.4817 | 0.3688 |
| 0.0022 | 45.0 | 2250 | 3.8500 | 0.435 | 1.0161 | 9.2035 | 0.435 | 0.4164 | 0.5128 | 0.3723 |
| 0.0022 | 46.0 | 2300 | 3.8806 | 0.435 | 1.0261 | 10.7033 | 0.435 | 0.4323 | 0.5008 | 0.3812 |
| 0.0022 | 47.0 | 2350 | 3.8114 | 0.455 | 1.0128 | 9.6784 | 0.455 | 0.4236 | 0.5025 | 0.3873 |
| 0.0022 | 48.0 | 2400 | 3.8743 | 0.435 | 1.0294 | 8.7193 | 0.435 | 0.3734 | 0.5109 | 0.3783 |
| 0.0022 | 49.0 | 2450 | 3.9281 | 0.43 | 1.0414 | 9.9489 | 0.4300 | 0.4296 | 0.5047 | 0.4049 |
| 0.0047 | 50.0 | 2500 | 3.7824 | 0.45 | 0.9956 | 10.7814 | 0.45 | 0.4467 | 0.4975 | 0.3949 |
| 0.0047 | 51.0 | 2550 | 4.0089 | 0.475 | 0.9668 | 11.9005 | 0.4750 | 0.4253 | 0.4637 | 0.4501 |
| 0.0047 | 52.0 | 2600 | 3.7248 | 0.43 | 0.9909 | 10.6449 | 0.4300 | 0.4064 | 0.4750 | 0.4023 |
| 0.0047 | 53.0 | 2650 | 3.7911 | 0.415 | 1.0491 | 9.1188 | 0.415 | 0.3608 | 0.5130 | 0.4173 |
| 0.0047 | 54.0 | 2700 | 3.6925 | 0.44 | 1.0000 | 8.9655 | 0.44 | 0.3970 | 0.4826 | 0.4168 |
| 0.0047 | 55.0 | 2750 | 3.6214 | 0.46 | 0.9590 | 9.5422 | 0.46 | 0.4440 | 0.4636 | 0.3829 |
| 0.0047 | 56.0 | 2800 | 4.3545 | 0.405 | 1.0811 | 10.6531 | 0.405 | 0.4090 | 0.5439 | 0.4533 |
| 0.0047 | 57.0 | 2850 | 3.6835 | 0.46 | 0.9717 | 8.2408 | 0.46 | 0.4367 | 0.4950 | 0.4118 |
| 0.0047 | 58.0 | 2900 | 4.0080 | 0.465 | 1.0011 | 9.3764 | 0.465 | 0.4579 | 0.4927 | 0.4234 |
| 0.0047 | 59.0 | 2950 | 4.0141 | 0.45 | 1.0014 | 9.7100 | 0.45 | 0.4443 | 0.4987 | 0.4220 |
| 0.0118 | 60.0 | 3000 | 3.7963 | 0.43 | 1.0135 | 9.4040 | 0.4300 | 0.4007 | 0.5007 | 0.3979 |
| 0.0118 | 61.0 | 3050 | 4.0609 | 0.43 | 1.0426 | 9.3533 | 0.4300 | 0.3825 | 0.5266 | 0.4285 |
| 0.0118 | 62.0 | 3100 | 4.0150 | 0.47 | 1.0002 | 9.3307 | 0.47 | 0.4490 | 0.5030 | 0.4052 |
| 0.0118 | 63.0 | 3150 | 3.7982 | 0.47 | 0.9660 | 8.5060 | 0.47 | 0.4581 | 0.4716 | 0.3988 |
| 0.0118 | 64.0 | 3200 | 4.3553 | 0.44 | 1.0428 | 10.3840 | 0.44 | 0.4218 | 0.5163 | 0.4312 |
| 0.0118 | 65.0 | 3250 | 3.7142 | 0.44 | 0.9900 | 8.5049 | 0.44 | 0.4298 | 0.4849 | 0.3735 |
| 0.0118 | 66.0 | 3300 | 3.7411 | 0.47 | 0.9661 | 8.1935 | 0.47 | 0.4497 | 0.4789 | 0.3812 |
| 0.0118 | 67.0 | 3350 | 3.7858 | 0.49 | 0.9574 | 8.8397 | 0.49 | 0.4799 | 0.4616 | 0.3895 |
| 0.0118 | 68.0 | 3400 | 3.7927 | 0.495 | 0.9459 | 8.6915 | 0.495 | 0.4870 | 0.4577 | 0.3883 |
| 0.0118 | 69.0 | 3450 | 3.8348 | 0.5 | 0.9454 | 8.8298 | 0.5 | 0.4889 | 0.4715 | 0.3891 |
| 0.0004 | 70.0 | 3500 | 3.8551 | 0.48 | 0.9500 | 8.9827 | 0.48 | 0.4755 | 0.4691 | 0.3913 |
| 0.0004 | 71.0 | 3550 | 3.8432 | 0.48 | 0.9622 | 9.1404 | 0.48 | 0.4691 | 0.4690 | 0.3885 |
| 0.0004 | 72.0 | 3600 | 3.8594 | 0.48 | 0.9617 | 8.8182 | 0.48 | 0.4691 | 0.4805 | 0.3854 |
| 0.0004 | 73.0 | 3650 | 3.8855 | 0.485 | 0.9622 | 8.8248 | 0.485 | 0.4760 | 0.4809 | 0.3881 |
| 0.0004 | 74.0 | 3700 | 3.8996 | 0.49 | 0.9610 | 8.9750 | 0.49 | 0.4818 | 0.4634 | 0.3892 |
| 0.0004 | 75.0 | 3750 | 3.9921 | 0.475 | 0.9642 | 9.5409 | 0.4750 | 0.4597 | 0.4666 | 0.4185 |
| 0.0004 | 76.0 | 3800 | 4.1128 | 0.43 | 1.0429 | 9.9966 | 0.4300 | 0.3844 | 0.5187 | 0.4056 |
| 0.0004 | 77.0 | 3850 | 4.0783 | 0.44 | 1.0172 | 9.3016 | 0.44 | 0.4205 | 0.5051 | 0.3988 |
| 0.0004 | 78.0 | 3900 | 4.0804 | 0.445 | 1.0254 | 8.9753 | 0.445 | 0.4246 | 0.5089 | 0.3982 |
| 0.0004 | 79.0 | 3950 | 4.0892 | 0.445 | 1.0269 | 8.8290 | 0.445 | 0.4246 | 0.5069 | 0.4000 |
| 0.0002 | 80.0 | 4000 | 4.1013 | 0.445 | 1.0258 | 9.1363 | 0.445 | 0.4246 | 0.5129 | 0.4033 |
| 0.0002 | 81.0 | 4050 | 4.0985 | 0.44 | 1.0287 | 9.1459 | 0.44 | 0.4213 | 0.5074 | 0.4054 |
| 0.0002 | 82.0 | 4100 | 4.1029 | 0.44 | 1.0263 | 9.3107 | 0.44 | 0.4211 | 0.5125 | 0.4066 |
| 0.0002 | 83.0 | 4150 | 4.1075 | 0.44 | 1.0248 | 9.4604 | 0.44 | 0.4224 | 0.5164 | 0.4061 |
| 0.0002 | 84.0 | 4200 | 4.1087 | 0.44 | 1.0225 | 9.7739 | 0.44 | 0.4221 | 0.5090 | 0.4055 |
| 0.0002 | 85.0 | 4250 | 4.1248 | 0.44 | 1.0262 | 9.7747 | 0.44 | 0.4259 | 0.5032 | 0.4065 |
| 0.0002 | 86.0 | 4300 | 4.1527 | 0.445 | 1.0263 | 9.4647 | 0.445 | 0.4299 | 0.5128 | 0.4066 |
| 0.0002 | 87.0 | 4350 | 4.0529 | 0.475 | 0.9810 | 9.1439 | 0.4750 | 0.4488 | 0.4910 | 0.3938 |
| 0.0002 | 88.0 | 4400 | 4.1405 | 0.455 | 1.0091 | 9.5149 | 0.455 | 0.4230 | 0.4966 | 0.4147 |
| 0.0002 | 89.0 | 4450 | 4.3483 | 0.41 | 1.0724 | 9.8421 | 0.41 | 0.4083 | 0.5384 | 0.4090 |
| 0.0008 | 90.0 | 4500 | 4.5574 | 0.39 | 1.1077 | 11.2517 | 0.39 | 0.3940 | 0.5618 | 0.4405 |
| 0.0008 | 91.0 | 4550 | 4.5104 | 0.41 | 1.0890 | 10.8687 | 0.41 | 0.4173 | 0.5411 | 0.4350 |
| 0.0008 | 92.0 | 4600 | 4.3791 | 0.425 | 1.0672 | 10.7198 | 0.425 | 0.4202 | 0.5233 | 0.4306 |
| 0.0008 | 93.0 | 4650 | 4.3608 | 0.43 | 1.0553 | 10.8428 | 0.4300 | 0.4236 | 0.5196 | 0.4284 |
| 0.0008 | 94.0 | 4700 | 4.3469 | 0.44 | 1.0474 | 10.6774 | 0.44 | 0.4428 | 0.5020 | 0.4280 |
| 0.0008 | 95.0 | 4750 | 4.3420 | 0.44 | 1.0487 | 10.5138 | 0.44 | 0.4385 | 0.5260 | 0.4270 |
| 0.0008 | 96.0 | 4800 | 4.3385 | 0.435 | 1.0491 | 10.3448 | 0.435 | 0.4312 | 0.5170 | 0.4266 |
| 0.0008 | 97.0 | 4850 | 4.3341 | 0.435 | 1.0485 | 10.3378 | 0.435 | 0.4312 | 0.5136 | 0.4261 |
| 0.0008 | 98.0 | 4900 | 4.3336 | 0.435 | 1.0480 | 10.3350 | 0.435 | 0.4312 | 0.5184 | 0.4253 |
| 0.0008 | 99.0 | 4950 | 4.3306 | 0.435 | 1.0472 | 10.3328 | 0.435 | 0.4299 | 0.5116 | 0.4245 |
| 0.0001 | 100.0 | 5000 | 4.3305 | 0.435 | 1.0472 | 10.3327 | 0.435 | 0.4299 | 0.5115 | 0.4245 |

Framework versions

  • Transformers 4.28.0.dev0
  • Pytorch 1.12.1+cu113
  • Datasets 2.12.0
  • Tokenizers 0.12.1