longformer_pos

This model is a fine-tuned version of severinsimmler/xlm-roberta-longformer-base-16384 on an unspecified dataset. It achieves the following results on the evaluation set (a hypothetical usage sketch follows the metrics):

  • Loss: 0.6453
  • Precision: 0.5508
  • Recall: 0.5803
  • F1: 0.5651
  • Accuracy: 0.8941

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding Trainer configuration follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
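
A minimal sketch of how these hyperparameters map onto transformers.TrainingArguments. The token-classification head, `train_ds`, `eval_ds`, and `num_labels` are assumptions, since the card does not document the task or dataset; the evaluation and logging intervals are inferred from the results table below.

```python
# A minimal sketch of the reported setup with the Hugging Face Trainer.
# Assumptions (not documented in this card): the task is token
# classification, and `train_ds`, `eval_ds`, and `num_labels` come from
# the unspecified training data.
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "severinsimmler/xlm-roberta-longformer-base-16384"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForTokenClassification.from_pretrained(base, num_labels=num_labels)

args = TrainingArguments(
    output_dir="longformer_pos",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    adam_beta1=0.9,                # Adam betas and epsilon from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",   # assumed: the table reports metrics every 50 steps
    eval_steps=50,
    logging_steps=500,             # assumed: training loss appears every 500 steps
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
)
trainer.train()
```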

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:--:|:--------:|
| No log | 1.35 | 50 | 0.7424 | 0.0 | 0.0 | 0.0 | 0.7648 |
| No log | 2.7 | 100 | 0.4849 | 0.0415 | 0.0388 | 0.0401 | 0.8160 |
| No log | 4.05 | 150 | 0.3986 | 0.0902 | 0.1163 | 0.1016 | 0.8418 |
| No log | 5.41 | 200 | 0.3393 | 0.1827 | 0.1880 | 0.1853 | 0.8675 |
| No log | 6.76 | 250 | 0.3370 | 0.275 | 0.2132 | 0.2402 | 0.8788 |
| No log | 8.11 | 300 | 0.2937 | 0.3605 | 0.5310 | 0.4295 | 0.8864 |
| No log | 9.46 | 350 | 0.2793 | 0.4088 | 0.4302 | 0.4193 | 0.8997 |
| No log | 10.81 | 400 | 0.2500 | 0.4457 | 0.5969 | 0.5104 | 0.9066 |
| No log | 12.16 | 450 | 0.2894 | 0.5031 | 0.6221 | 0.5563 | 0.9107 |
| 0.3689 | 13.51 | 500 | 0.3678 | 0.5269 | 0.5116 | 0.5192 | 0.9036 |
| 0.3689 | 14.86 | 550 | 0.3156 | 0.5216 | 0.6085 | 0.5617 | 0.9100 |
| 0.3689 | 16.22 | 600 | 0.3824 | 0.5551 | 0.5756 | 0.5652 | 0.9115 |
| 0.3689 | 17.57 | 650 | 0.3347 | 0.4276 | 0.4981 | 0.4602 | 0.9075 |
| 0.3689 | 18.92 | 700 | 0.3705 | 0.4610 | 0.6880 | 0.5521 | 0.8920 |
| 0.3689 | 20.27 | 750 | 0.3276 | 0.5447 | 0.6492 | 0.5924 | 0.9100 |
| 0.3689 | 21.62 | 800 | 0.4603 | 0.5650 | 0.5562 | 0.5605 | 0.9107 |
| 0.3689 | 22.97 | 850 | 0.3142 | 0.5677 | 0.6260 | 0.5954 | 0.9177 |
| 0.3689 | 24.32 | 900 | 0.3887 | 0.5747 | 0.6260 | 0.5993 | 0.9164 |
| 0.3689 | 25.68 | 950 | 0.5906 | 0.4670 | 0.6860 | 0.5557 | 0.8789 |
| 0.0798 | 27.03 | 1000 | 0.5407 | 0.6218 | 0.5736 | 0.5968 | 0.8989 |
| 0.0798 | 28.38 | 1050 | 0.4645 | 0.5044 | 0.5504 | 0.5264 | 0.9051 |
| 0.0798 | 29.73 | 1100 | 0.3217 | 0.5107 | 0.6027 | 0.5529 | 0.9104 |
| 0.0798 | 31.08 | 1150 | 0.4471 | 0.5523 | 0.6647 | 0.6033 | 0.9055 |
| 0.0798 | 32.43 | 1200 | 0.4611 | 0.5029 | 0.6725 | 0.5755 | 0.8980 |
| 0.0798 | 33.78 | 1250 | 0.4495 | 0.5783 | 0.6085 | 0.5930 | 0.9155 |
| 0.0798 | 35.14 | 1300 | 0.5293 | 0.5727 | 0.6105 | 0.5910 | 0.9128 |
| 0.0798 | 36.49 | 1350 | 0.4453 | 0.5652 | 0.5795 | 0.5722 | 0.9100 |
| 0.0798 | 37.84 | 1400 | 0.3912 | 0.5988 | 0.5988 | 0.5988 | 0.9162 |
| 0.0798 | 39.19 | 1450 | 0.3862 | 0.5917 | 0.6066 | 0.5990 | 0.9182 |
| 0.0393 | 40.54 | 1500 | 0.4303 | 0.5337 | 0.6744 | 0.5959 | 0.9137 |
| 0.0393 | 41.89 | 1550 | 0.3846 | 0.5129 | 0.6550 | 0.5753 | 0.9119 |
| 0.0393 | 43.24 | 1600 | 0.5571 | 0.5735 | 0.6047 | 0.5887 | 0.9124 |
| 0.0393 | 44.59 | 1650 | 0.4528 | 0.5719 | 0.6395 | 0.6038 | 0.9182 |
| 0.0393 | 45.95 | 1700 | 0.5202 | 0.6037 | 0.6260 | 0.6147 | 0.9130 |
| 0.0393 | 47.3 | 1750 | 0.5163 | 0.5743 | 0.5019 | 0.5357 | 0.8990 |
| 0.0393 | 48.65 | 1800 | 0.3528 | 0.5771 | 0.6531 | 0.6127 | 0.9157 |
| 0.0393 | 50.0 | 1850 | 0.4441 | 0.5654 | 0.6531 | 0.6061 | 0.9155 |
| 0.0393 | 51.35 | 1900 | 0.4517 | 0.6262 | 0.6105 | 0.6183 | 0.9151 |
| 0.0393 | 52.7 | 1950 | 0.4142 | 0.5812 | 0.6105 | 0.5955 | 0.9142 |
| 0.0315 | 54.05 | 2000 | 0.4539 | 0.5694 | 0.6357 | 0.6007 | 0.9180 |
| 0.0315 | 55.41 | 2050 | 0.4912 | 0.4107 | 0.5795 | 0.4807 | 0.9097 |
| 0.0315 | 56.76 | 2100 | 0.4442 | 0.5514 | 0.5194 | 0.5349 | 0.9190 |
| 0.0315 | 58.11 | 2150 | 0.4871 | 0.5414 | 0.6337 | 0.5839 | 0.9074 |
| 0.0315 | 59.46 | 2200 | 0.6469 | 0.5937 | 0.5465 | 0.5691 | 0.9072 |
| 0.0315 | 60.81 | 2250 | 0.4975 | 0.6346 | 0.6395 | 0.6371 | 0.9167 |
| 0.0315 | 62.16 | 2300 | 0.4800 | 0.6060 | 0.6260 | 0.6158 | 0.9151 |
| 0.0315 | 63.51 | 2350 | 0.5273 | 0.6047 | 0.5988 | 0.6018 | 0.9137 |
| 0.0315 | 64.86 | 2400 | 0.4613 | 0.5794 | 0.6221 | 0.6 | 0.9145 |
| 0.0315 | 66.22 | 2450 | 0.4839 | 0.5996 | 0.6298 | 0.6144 | 0.9189 |
| 0.0287 | 67.57 | 2500 | 0.4725 | 0.4970 | 0.6415 | 0.5601 | 0.9020 |
| 0.0287 | 68.92 | 2550 | 0.5888 | 0.6614 | 0.5717 | 0.6133 | 0.8999 |
| 0.0287 | 70.27 | 2600 | 0.4525 | 0.6021 | 0.5601 | 0.5803 | 0.9086 |
| 0.0287 | 71.62 | 2650 | 0.4416 | 0.5743 | 0.6066 | 0.5900 | 0.9157 |
| 0.0287 | 72.97 | 2700 | 0.4290 | 0.5084 | 0.6473 | 0.5695 | 0.8974 |
| 0.0287 | 74.32 | 2750 | 0.5249 | 0.5778 | 0.5543 | 0.5658 | 0.9103 |
| 0.0287 | 75.68 | 2800 | 0.5481 | 0.6149 | 0.5601 | 0.5862 | 0.9042 |
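
The per-step Precision/Recall/F1/Accuracy columns above are the overall scores that seqeval reports for token classification. A sketch of a compute_metrics function that would produce them; the exact metric code is an assumption, and `label_list` is a hypothetical placeholder for the undocumented tag set.

```python
# A minimal sketch of how the table's metrics are commonly computed for
# token classification, using seqeval via the `evaluate` library. This is
# an assumed implementation, not one documented in the card.
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")
label_list = ["O", "B-TAG", "I-TAG"]  # hypothetical placeholder labels

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # Ignore special tokens, which are labelled -100 by convention.
    true_predictions = [
        [label_list[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    results = seqeval.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```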

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2