ruRoberta-large_neg

This model is a fine-tuned version of ai-forever/ruRoberta-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6173
  • Precision: 0.5980
  • Recall: 0.5920
  • F1: 0.5950
  • Accuracy: 0.9001
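
The card does not state the task, but the precision/recall/F1 metrics alongside token-level accuracy are typical of sequence labeling, and the `_neg` suffix suggests negation tagging. Under that assumption, a minimal usage sketch (the example sentence is illustrative only):

```python
from transformers import pipeline

# Assumption: the model carries a token-classification head.
tagger = pipeline(
    "token-classification",
    model="DimasikKurd/ruRoberta-large_neg",
    aggregation_strategy="simple",  # merge sub-word pieces into word-level spans
)

# Illustrative Russian input: "The patient does not report a rise in temperature."
print(tagger("Пациент не отмечает повышения температуры."))
```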

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
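
A sketch of the corresponding Trainer configuration, assuming these hyperparameters map onto TrainingArguments as usual; output_dir and the per-epoch evaluation cadence are assumptions (the results table below reports one evaluation per epoch):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ruRoberta-large_neg",  # assumed checkpoint directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",       # assumption inferred from the results table
)
# Adam betas=(0.9, 0.999) and epsilon=1e-08 match the Trainer's AdamW defaults,
# so they need no explicit arguments.
```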

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 50   | 0.6748          | 0.0       | 0.0    | 0.0    | 0.7758   |
| No log        | 2.0   | 100  | 0.6015          | 0.0054    | 0.0019 | 0.0028 | 0.7853   |
| No log        | 3.0   | 150  | 0.4397          | 0.0699    | 0.0867 | 0.0774 | 0.8296   |
| No log        | 4.0   | 200  | 0.3701          | 0.1805    | 0.2351 | 0.2042 | 0.8555   |
| No log        | 5.0   | 250  | 0.3134          | 0.3189    | 0.3680 | 0.3417 | 0.8823   |
| No log        | 6.0   | 300  | 0.2931          | 0.3305    | 0.4528 | 0.3821 | 0.8921   |
| No log        | 7.0   | 350  | 0.2891          | 0.4114    | 0.4297 | 0.4204 | 0.9017   |
| No log        | 8.0   | 400  | 0.2799          | 0.4714    | 0.5087 | 0.4893 | 0.9033   |
| No log        | 9.0   | 450  | 0.2671          | 0.5045    | 0.5453 | 0.5241 | 0.9118   |
| 0.3651        | 10.0  | 500  | 0.2917          | 0.5287    | 0.5145 | 0.5215 | 0.9149   |
| 0.3651        | 11.0  | 550  | 0.2900          | 0.4768    | 0.6127 | 0.5363 | 0.9105   |
| 0.3651        | 12.0  | 600  | 0.3307          | 0.4873    | 0.5896 | 0.5336 | 0.9135   |
| 0.3651        | 13.0  | 650  | 0.2883          | 0.5490    | 0.6050 | 0.5756 | 0.9163   |
| 0.3651        | 14.0  | 700  | 0.3514          | 0.5308    | 0.5819 | 0.5551 | 0.9170   |
| 0.3651        | 15.0  | 750  | 0.3858          | 0.5120    | 0.6590 | 0.5762 | 0.9055   |
| 0.3651        | 16.0  | 800  | 0.3655          | 0.5008    | 0.6262 | 0.5565 | 0.9204   |
| 0.3651        | 17.0  | 850  | 0.3605          | 0.5952    | 0.6628 | 0.6272 | 0.9206   |
| 0.3651        | 18.0  | 900  | 0.5156          | 0.5822    | 0.6416 | 0.6104 | 0.9148   |
| 0.3651        | 19.0  | 950  | 0.4462          | 0.4873    | 0.6628 | 0.5616 | 0.8964   |
| 0.0734        | 20.0  | 1000 | 0.3837          | 0.5817    | 0.5626 | 0.5720 | 0.9147   |
| 0.0734        | 21.0  | 1050 | 0.5484          | 0.6283    | 0.5472 | 0.5850 | 0.9122   |
| 0.0734        | 22.0  | 1100 | 0.4612          | 0.4459    | 0.6358 | 0.5242 | 0.8869   |
| 0.0734        | 23.0  | 1150 | 0.5106          | 0.5880    | 0.5665 | 0.5770 | 0.9146   |
| 0.0734        | 24.0  | 1200 | 0.4511          | 0.6526    | 0.5973 | 0.6237 | 0.9187   |
| 0.0734        | 25.0  | 1250 | 0.4511          | 0.6152    | 0.6069 | 0.6111 | 0.9183   |
| 0.0734        | 26.0  | 1300 | 0.4642          | 0.6141    | 0.5703 | 0.5914 | 0.9141   |
| 0.0734        | 27.0  | 1350 | 0.4177          | 0.5191    | 0.6802 | 0.5888 | 0.9057   |
| 0.0734        | 28.0  | 1400 | 0.4025          | 0.6011    | 0.6532 | 0.6260 | 0.9210   |
| 0.0734        | 29.0  | 1450 | 0.4620          | 0.5519    | 0.6455 | 0.5950 | 0.9068   |
| 0.0435        | 30.0  | 1500 | 0.4229          | 0.6029    | 0.6320 | 0.6171 | 0.9205   |
| 0.0435        | 31.0  | 1550 | 0.3752          | 0.5565    | 0.6647 | 0.6058 | 0.9139   |
| 0.0435        | 32.0  | 1600 | 0.5814          | 0.6146    | 0.5684 | 0.5906 | 0.9131   |
| 0.0435        | 33.0  | 1650 | 0.4216          | 0.6155    | 0.5800 | 0.5972 | 0.9128   |
| 0.0435        | 34.0  | 1700 | 0.5093          | 0.5853    | 0.5819 | 0.5836 | 0.9147   |
| 0.0435        | 35.0  | 1750 | 0.4221          | 0.5968    | 0.6532 | 0.6237 | 0.9153   |
| 0.0435        | 36.0  | 1800 | 0.4700          | 0.6404    | 0.6416 | 0.6410 | 0.9179   |
| 0.0435        | 37.0  | 1850 | 0.3946          | 0.5651    | 0.5684 | 0.5668 | 0.9167   |
| 0.0435        | 38.0  | 1900 | 0.4196          | 0.6013    | 0.5549 | 0.5772 | 0.9062   |
| 0.0435        | 39.0  | 1950 | 0.4054          | 0.6282    | 0.5761 | 0.6010 | 0.9194   |
| 0.0447        | 40.0  | 2000 | 0.3649          | 0.6075    | 0.5934 | 0.6004 | 0.9133   |
| 0.0447        | 41.0  | 2050 | 0.4154          | 0.5907    | 0.6089 | 0.5996 | 0.9145   |

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2
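
To reproduce this environment, the versions above can be pinned at install time (a sketch; torch is the PyPI package name for Pytorch):

```
pip install transformers==4.38.2 torch==2.1.2 datasets==2.1.0 tokenizers==0.15.2
```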