ruRoberta-large_neg

This model is a fine-tuned version of ai-forever/ruRoberta-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6173
  • Precision: 0.5980
  • Recall: 0.5920
  • F1: 0.5950
  • Accuracy: 0.9001
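
The card does not state the task, but the precision/recall/F1 metrics alongside token-level accuracy are typical of sequence labeling, and the `_neg` suffix suggests negation tagging. Under that assumption, a minimal usage sketch (the example sentence is illustrative only):

```python
from transformers import pipeline

# Assumption: the model carries a token-classification head.
tagger = pipeline(
    "token-classification",
    model="DimasikKurd/ruRoberta-large_neg",
    aggregation_strategy="simple",  # merge sub-word pieces into word-level spans
)

# Illustrative Russian input: "The patient does not report a rise in temperature."
print(tagger("Пациент не отмечает повышения температуры."))
```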

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
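
A sketch of the corresponding Trainer configuration, assuming these hyperparameters map onto TrainingArguments as usual; output_dir and the per-epoch evaluation cadence are assumptions (the results table below reports one evaluation per epoch):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ruRoberta-large_neg",  # assumed checkpoint directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",       # assumption inferred from the results table
)
# Adam betas=(0.9, 0.999) and epsilon=1e-08 match the Trainer's AdamW defaults,
# so they need no explicit arguments.
```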

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 50   | 0.6748          | 0.0       | 0.0    | 0.0    | 0.7758   |
| No log        | 2.0   | 100  | 0.6015          | 0.0054    | 0.0019 | 0.0028 | 0.7853   |
| No log        | 3.0   | 150  | 0.4397          | 0.0699    | 0.0867 | 0.0774 | 0.8296   |
| No log        | 4.0   | 200  | 0.3701          | 0.1805    | 0.2351 | 0.2042 | 0.8555   |
| No log        | 5.0   | 250  | 0.3134          | 0.3189    | 0.3680 | 0.3417 | 0.8823   |
| No log        | 6.0   | 300  | 0.2931          | 0.3305    | 0.4528 | 0.3821 | 0.8921   |
| No log        | 7.0   | 350  | 0.2891          | 0.4114    | 0.4297 | 0.4204 | 0.9017   |
| No log        | 8.0   | 400  | 0.2799          | 0.4714    | 0.5087 | 0.4893 | 0.9033   |
| No log        | 9.0   | 450  | 0.2671          | 0.5045    | 0.5453 | 0.5241 | 0.9118   |
| 0.3651        | 10.0  | 500  | 0.2917          | 0.5287    | 0.5145 | 0.5215 | 0.9149   |
| 0.3651        | 11.0  | 550  | 0.2900          | 0.4768    | 0.6127 | 0.5363 | 0.9105   |
| 0.3651        | 12.0  | 600  | 0.3307          | 0.4873    | 0.5896 | 0.5336 | 0.9135   |
| 0.3651        | 13.0  | 650  | 0.2883          | 0.5490    | 0.6050 | 0.5756 | 0.9163   |
| 0.3651        | 14.0  | 700  | 0.3514          | 0.5308    | 0.5819 | 0.5551 | 0.9170   |
| 0.3651        | 15.0  | 750  | 0.3858          | 0.5120    | 0.6590 | 0.5762 | 0.9055   |
| 0.3651        | 16.0  | 800  | 0.3655          | 0.5008    | 0.6262 | 0.5565 | 0.9204   |
| 0.3651        | 17.0  | 850  | 0.3605          | 0.5952    | 0.6628 | 0.6272 | 0.9206   |
| 0.3651        | 18.0  | 900  | 0.5156          | 0.5822    | 0.6416 | 0.6104 | 0.9148   |
| 0.3651        | 19.0  | 950  | 0.4462          | 0.4873    | 0.6628 | 0.5616 | 0.8964   |
| 0.0734        | 20.0  | 1000 | 0.3837          | 0.5817    | 0.5626 | 0.5720 | 0.9147   |
| 0.0734        | 21.0  | 1050 | 0.5484          | 0.6283    | 0.5472 | 0.5850 | 0.9122   |
| 0.0734        | 22.0  | 1100 | 0.4612          | 0.4459    | 0.6358 | 0.5242 | 0.8869   |
| 0.0734        | 23.0  | 1150 | 0.5106          | 0.5880    | 0.5665 | 0.5770 | 0.9146   |
| 0.0734        | 24.0  | 1200 | 0.4511          | 0.6526    | 0.5973 | 0.6237 | 0.9187   |
| 0.0734        | 25.0  | 1250 | 0.4511          | 0.6152    | 0.6069 | 0.6111 | 0.9183   |
| 0.0734        | 26.0  | 1300 | 0.4642          | 0.6141    | 0.5703 | 0.5914 | 0.9141   |
| 0.0734        | 27.0  | 1350 | 0.4177          | 0.5191    | 0.6802 | 0.5888 | 0.9057   |
| 0.0734        | 28.0  | 1400 | 0.4025          | 0.6011    | 0.6532 | 0.6260 | 0.9210   |
| 0.0734        | 29.0  | 1450 | 0.4620          | 0.5519    | 0.6455 | 0.5950 | 0.9068   |
| 0.0435        | 30.0  | 1500 | 0.4229          | 0.6029    | 0.6320 | 0.6171 | 0.9205   |
| 0.0435        | 31.0  | 1550 | 0.3752          | 0.5565    | 0.6647 | 0.6058 | 0.9139   |
| 0.0435        | 32.0  | 1600 | 0.5814          | 0.6146    | 0.5684 | 0.5906 | 0.9131   |
| 0.0435        | 33.0  | 1650 | 0.4216          | 0.6155    | 0.5800 | 0.5972 | 0.9128   |
| 0.0435        | 34.0  | 1700 | 0.5093          | 0.5853    | 0.5819 | 0.5836 | 0.9147   |
| 0.0435        | 35.0  | 1750 | 0.4221          | 0.5968    | 0.6532 | 0.6237 | 0.9153   |
| 0.0435        | 36.0  | 1800 | 0.4700          | 0.6404    | 0.6416 | 0.6410 | 0.9179   |
| 0.0435        | 37.0  | 1850 | 0.3946          | 0.5651    | 0.5684 | 0.5668 | 0.9167   |
| 0.0435        | 38.0  | 1900 | 0.4196          | 0.6013    | 0.5549 | 0.5772 | 0.9062   |
| 0.0435        | 39.0  | 1950 | 0.4054          | 0.6282    | 0.5761 | 0.6010 | 0.9194   |
| 0.0447        | 40.0  | 2000 | 0.3649          | 0.6075    | 0.5934 | 0.6004 | 0.9133   |
| 0.0447        | 41.0  | 2050 | 0.4154          | 0.5907    | 0.6089 | 0.5996 | 0.9145   |

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2
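
To reproduce this environment, the versions above can be pinned at install time (a sketch; torch is the PyPI package name for Pytorch):

```
pip install transformers==4.38.2 torch==2.1.2 datasets==2.1.0 tokenizers==0.15.2
```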