twitter_trainer

This model is a fine-tuned version of bert-base-cased on an unknown dataset. Its results on the evaluation set are reported per epoch in the table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 7
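
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and traces a linear schedule with warmup. It is a plain-Python approximation, not the training code itself: it assumes a single device, takes the final step count (4172) from the results table as the total number of optimizer steps, and assumes the warmup length is the warmup ratio times that total, rounded up. The name `linear_lr` is hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 4
gradient_accumulation_steps = 4
warmup_ratio = 0.1

# Assumption: the last step in the results table is the total number of optimizer steps.
total_steps = 4172

# Effective batch size per optimizer update (single device assumed).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: warmup length is the ratio times the total step count, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate climbs from zero to 5e-05 over roughly the first tenth of training and then falls linearly back to zero at the final step.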

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1
3.6245	1.0	597	0.4451	84.1709	99.8149	103.1569	101.4584
1.8323	2.0	1194	0.3794	86.0972	102.3665	100.0625	101.2014
1.233	3.0	1791	0.3715	87.5209	100.9234	102.3408	101.6272
0.9132	4.0	2388	0.5171	87.1022	102.4483	100.4991	101.4643
0.6928	5.0	2985	0.6683	86.9347	102.6526	100.5006	101.5652
0.4037	6.0	3582	0.7477	87.3534	101.8838	101.3746	101.6286
0.3334	6.9891	4172	0.7924	86.8509	102.7555	100.3442	101.5355
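
One way to read the table is to check that the reported F1 is the harmonic mean of the precision and recall columns, and to find the epoch with the lowest validation loss. The sketch below does both using only the numbers above. The variable names are hypothetical, and it makes no claim about how the metrics were originally computed.

```python
# Rows copied from the table above:
# (epoch, validation loss, accuracy, precision, recall, F1)
rows = [
    (1.0, 0.4451, 84.1709, 99.8149, 103.1569, 101.4584),
    (2.0, 0.3794, 86.0972, 102.3665, 100.0625, 101.2014),
    (3.0, 0.3715, 87.5209, 100.9234, 102.3408, 101.6272),
    (4.0, 0.5171, 87.1022, 102.4483, 100.4991, 101.4643),
    (5.0, 0.6683, 86.9347, 102.6526, 100.5006, 101.5652),
    (6.0, 0.7477, 87.3534, 101.8838, 101.3746, 101.6286),
    (6.9891, 0.7924, 86.8509, 102.7555, 100.3442, 101.5355),
]


def harmonic_mean(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Gap between the recomputed and the reported F1 for each row.
f1_gaps = [abs(harmonic_mean(p, r) - f1) for _, _, _, p, r, f1 in rows]

# Checkpoints that look best by validation loss and by accuracy.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_accuracy = max(rows, key=lambda row: row[2])

if __name__ == "__main__":
    print("largest F1 gap:", max(f1_gaps))
    print("lowest validation loss at epoch", best_by_loss[0])
    print("highest accuracy at epoch", best_by_accuracy[0])
```

On these numbers both criteria select epoch 3. After that point the validation loss rises while the training loss keeps falling. The check only tests internal consistency of the reported columns; it does not explain why some precision, recall, and F1 values exceed 100.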