longformer_pos_neg

This model is a fine-tuned version of severinsimmler/xlm-roberta-longformer-base-16384. The training dataset is not named (the auto-generated card recorded it as None). It achieves the following results on the evaluation set:

Loss: 0.5549
Precision: 0.5599
Recall: 0.5786
F1: 0.5691
Accuracy: 0.9030
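
The reported F1 can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the three numbers listed above; the helper name `f1_from` is ours and is not part of the training code.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results above.
reported_precision, reported_recall, reported_f1 = 0.5599, 0.5786, 0.5691

computed_f1 = f1_from(reported_precision, reported_recall)

# The inputs are rounded to four decimals, so agreement to about 1e-4
# is the most we can expect.
assert abs(computed_f1 - reported_f1) < 1e-4
```

The zero guard matters for rows such as the first evaluation step in the training log, where precision and recall are both 0.0.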

Model description

More information needed

Intended uses & limitations

More information needed
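
Because the card does not state the task, one cautious first step is to read the checkpoint's configuration and see which model head and label names it declares before loading any weights. The following is a minimal sketch, assuming the `transformers` library is installed and the repository is reachable under the id shown on the model page; the function name `describe_checkpoint` is hypothetical.

```python
# Repository id as displayed on the model page.
REPO_ID = "DimasikKurd/longformer_pos_neg"


def describe_checkpoint(repo_id: str = REPO_ID) -> dict:
    """Fetch only the config and report the declared head and label map."""
    # Imported lazily so this module can be imported without `transformers`.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(repo_id)
    return {
        "architectures": config.architectures,  # e.g. a list of class names
        "id2label": config.id2label,            # label index -> label name
    }
```

The precision, recall, F1, and accuracy metrics suggest a token classification head, but that is an assumption; if `architectures` confirms it, `pipeline("token-classification", model=REPO_ID)` would be the natural way to run inference.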

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
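
For readers who want to reproduce the setup, the values above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch under stated assumptions, not the original training script: it assumes a single device (so the card's `train_batch_size` and `eval_batch_size` equal the per-device values), and the output directory name is a placeholder.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
HYPERPARAMETERS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 4,  # card: train_batch_size (assumes one device)
    "per_device_eval_batch_size": 8,   # card: eval_batch_size (assumes one device)
    "seed": 42,
    "adam_beta1": 0.9,                 # card: betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,           # card: num_epochs
}


def build_training_arguments(output_dir: str = "longformer_pos_neg"):
    """Build TrainingArguments; `output_dir` is a placeholder, not from the card."""
    # Imported lazily so the dictionary can be inspected without `transformers`.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Settings the card does not list, such as evaluation and logging intervals, weight decay, or warmup, are left at library defaults here and may differ from the original run.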

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	1.35	50	0.7729	0.0	0.0	0.0	0.7762
No log	2.7	100	0.5497	0.0220	0.0078	0.0115	0.8017
No log	4.05	150	0.4143	0.0706	0.0698	0.0702	0.8383
No log	5.41	200	0.3607	0.2329	0.2578	0.2447	0.8632
No log	6.76	250	0.3320	0.3628	0.3101	0.3344	0.8807
No log	8.11	300	0.3261	0.5108	0.4574	0.4826	0.8939
No log	9.46	350	0.3190	0.4229	0.5950	0.4944	0.8826
No log	10.81	400	0.2662	0.4821	0.6008	0.5349	0.9014
No log	12.16	450	0.2714	0.5901	0.5775	0.5837	0.9137
0.3792	13.51	500	0.2852	0.5769	0.5891	0.5829	0.9105
0.3792	14.86	550	0.3868	0.5876	0.5329	0.5589	0.9082
0.3792	16.22	600	0.3218	0.5444	0.6531	0.5938	0.9129
0.3792	17.57	650	0.3022	0.5645	0.6357	0.5980	0.9112
0.3792	18.92	700	0.3737	0.5419	0.6764	0.6017	0.9025
0.3792	20.27	750	0.3730	0.5411	0.6628	0.5958	0.9119
0.3792	21.62	800	0.4021	0.6145	0.6240	0.6192	0.9109
0.3792	22.97	850	0.3358	0.5159	0.6298	0.5672	0.9008
0.3792	24.32	900	0.3779	0.6065	0.6124	0.6095	0.9138
0.3792	25.68	950	0.4435	0.5293	0.6298	0.5752	0.9063
0.0755	27.03	1000	0.4230	0.6333	0.6124	0.6227	0.9169
0.0755	28.38	1050	0.3666	0.5911	0.6415	0.6152	0.9163
0.0755	29.73	1100	0.3335	0.6098	0.6512	0.6298	0.9178
0.0755	31.08	1150	0.4606	0.5725	0.6202	0.5953	0.9075
0.0755	32.43	1200	0.4280	0.5656	0.6434	0.6020	0.9065
0.0755	33.78	1250	0.4003	0.5833	0.6376	0.6093	0.9158
0.0755	35.14	1300	0.5802	0.6422	0.5775	0.6082	0.9020
0.0755	36.49	1350	0.4503	0.6014	0.6550	0.6271	0.9172
0.0755	37.84	1400	0.5614	0.6643	0.5523	0.6032	0.9044
0.0755	39.19	1450	0.5082	0.6280	0.6085	0.6181	0.9119
0.0407	40.54	1500	0.3964	0.6072	0.6531	0.6293	0.9165
0.0407	41.89	1550	0.5447	0.4572	0.6938	0.5512	0.8799
0.0407	43.24	1600	0.5303	0.4816	0.6589	0.5565	0.8947
0.0407	44.59	1650	0.4461	0.6409	0.6260	0.6333	0.9138
0.0407	45.95	1700	0.6884	0.5561	0.4031	0.4674	0.8766
0.0407	47.3	1750	0.4556	0.5431	0.6105	0.5748	0.9097
0.0407	48.65	1800	0.4272	0.6771	0.5853	0.6279	0.9183
0.0407	50.0	1850	0.4904	0.5603	0.6570	0.6048	0.9015
0.0407	51.35	1900	0.4206	0.5655	0.6357	0.5985	0.9135
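
The log can be scanned for the evaluation steps with the lowest validation loss and the highest F1, which helps when choosing a checkpoint. The sketch below copies the step, validation loss, and F1 columns from the table above; the variable names are ours. It also derives the number of optimizer steps per epoch from the row where step 1850 coincides with epoch 50.0.

```python
# Columns: step, validation loss, F1 (copied from the training results table).
LOG = """
50 0.7729 0.0
100 0.5497 0.0115
150 0.4143 0.0702
200 0.3607 0.2447
250 0.3320 0.3344
300 0.3261 0.4826
350 0.3190 0.4944
400 0.2662 0.5349
450 0.2714 0.5837
500 0.2852 0.5829
550 0.3868 0.5589
600 0.3218 0.5938
650 0.3022 0.5980
700 0.3737 0.6017
750 0.3730 0.5958
800 0.4021 0.6192
850 0.3358 0.5672
900 0.3779 0.6095
950 0.4435 0.5752
1000 0.4230 0.6227
1050 0.3666 0.6152
1100 0.3335 0.6298
1150 0.4606 0.5953
1200 0.4280 0.6020
1250 0.4003 0.6093
1300 0.5802 0.6082
1350 0.4503 0.6271
1400 0.5614 0.6032
1450 0.5082 0.6181
1500 0.3964 0.6293
1550 0.5447 0.5512
1600 0.5303 0.5565
1650 0.4461 0.6333
1700 0.6884 0.4674
1750 0.4556 0.5748
1800 0.4272 0.6279
1850 0.4904 0.6048
1900 0.4206 0.5985
"""

rows = [tuple(float(x) for x in line.split()) for line in LOG.strip().splitlines()]

lowest_loss = min(rows, key=lambda r: r[1])  # row with the smallest validation loss
highest_f1 = max(rows, key=lambda r: r[2])   # row with the largest F1

# Step 1850 is logged at epoch 50.0, which gives the steps per epoch.
steps_per_epoch = 1850 / 50.0
planned_steps = steps_per_epoch * 100        # num_epochs: 100 in the hyperparameters

print(int(lowest_loss[0]), lowest_loss[1])   # 400 0.2662
print(int(highest_f1[0]), highest_f1[2])     # 1650 0.6333
print(steps_per_epoch, planned_steps)        # 37.0 3700.0
```

Validation loss bottoms out at step 400 while F1 peaks much later at step 1650, and training loss keeps falling throughout (0.3792 to 0.0407), which is the usual signature of overfitting on the loss while the thresholded metrics still fluctuate. The log also stops at epoch 51.35 although 100 epochs were configured; the card does not say why.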

Framework versions

Transformers 4.38.2
Pytorch 2.1.2
Datasets 2.1.0
Tokenizers 0.15.2
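
To compare a local environment against these versions, a small check with the standard library's `importlib.metadata` is enough. This is a sketch: it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the function name `installed_vs_card` is ours.

```python
from importlib import metadata

# Versions listed on the card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.38.2",
    "torch": "2.1.2",       # listed on the card as "Pytorch"
    "datasets": "2.1.0",
    "tokenizers": "0.15.2",
}


def installed_vs_card() -> dict:
    """Map each package to (version on the card, installed version or None)."""
    report = {}
    for package, wanted in CARD_VERSIONS.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[package] = (wanted, found)
    return report


if __name__ == "__main__":
    for package, (wanted, found) in installed_vs_card().items():
        status = "ok" if found == wanted else "differs"
        print(f"{package}: card={wanted} installed={found} ({status})")
```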

Model repository: DimasikKurd/longformer_pos_neg