bert-large-uncased-sst-2-32-13-smoothed

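This model is a fine-tuned version of bert-large-uncased. The training dataset is not recorded in this card; the model name suggests SST-2, but that is not confirmed here. At the final epoch (step 150), it achieves the following results on the evaluation set: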

Loss: 0.6595
Accuracy: 0.75
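
Earlier checkpoints in the training results table score better than the final one reported here. As a minimal sketch, the Python snippet below copies the Validation Loss and Accuracy columns from that table and locates the best epoch for each. The function and variable names are illustrative and not part of any training script.

```python
# Validation loss for epochs 1..75, copied from the training results table.
val_loss = [
    0.8178, 0.8133, 0.8065, 0.7961, 0.7821,
    0.7655, 0.7460, 0.7247, 0.7034, 0.6892,
    0.6808, 0.6761, 0.6715, 0.6665, 0.6624,
    0.6615, 0.6572, 0.6529, 0.6503, 0.6499,
    0.6496, 0.6502, 0.6536, 0.6593, 0.6605,
    0.6592, 0.6578, 0.6575, 0.6571, 0.6575,
    0.6635, 0.6681, 0.6705, 0.6701, 0.6664,
    0.6621, 0.6599, 0.6604, 0.6637, 0.6647,
    0.6641, 0.6633, 0.6663, 0.6699, 0.6684,
    0.6625, 0.6582, 0.6549, 0.6523, 0.6523,
    0.6525, 0.6531, 0.6534, 0.6539, 0.6553,
    0.6540, 0.6555, 0.6565, 0.6588, 0.6609,
    0.6621, 0.6619, 0.6614, 0.6599, 0.6586,
    0.6583, 0.6580, 0.6582, 0.6586, 0.6591,
    0.6592, 0.6592, 0.6594, 0.6594, 0.6595,
]

# Accuracy for epochs 1..75, copied from the same table.
accuracy = [
    0.5156, 0.5156, 0.5156, 0.5156, 0.5156,
    0.5, 0.5, 0.5, 0.5312, 0.5938,
    0.6094, 0.6719, 0.75, 0.7812, 0.75,
    0.7344, 0.7344, 0.7656, 0.7969, 0.7812,
    0.7656, 0.7344, 0.75, 0.7344, 0.7344,
    0.7344, 0.75, 0.75, 0.7344, 0.75,
    0.75, 0.7344, 0.7188, 0.6875, 0.7188,
    0.7344, 0.7344, 0.7344, 0.7344, 0.7344,
    0.7344, 0.7344, 0.7344, 0.7344, 0.7344,
    0.7344, 0.75, 0.75, 0.7656, 0.75,
    0.75, 0.75, 0.75, 0.75, 0.7656,
    0.75, 0.7656, 0.7656, 0.7656, 0.75,
    0.7344, 0.7344, 0.7344, 0.7344, 0.75,
    0.7656, 0.7656, 0.75, 0.75, 0.75,
    0.75, 0.75, 0.75, 0.75, 0.75,
]


def best_epoch(values, mode):
    """Return (epoch, value) of the best entry; epochs are 1-indexed.

    mode is "min" for losses or "max" for accuracies. Ties go to the
    earliest epoch.
    """
    pick = min if mode == "min" else max
    idx = pick(range(len(values)), key=lambda i: values[i])
    return idx + 1, values[idx]


best_loss = best_epoch(val_loss, "min")
best_acc = best_epoch(accuracy, "max")
print("lowest validation loss:", best_loss)
print("highest accuracy:", best_acc)
```

In the table, validation loss is lowest at epoch 21 and accuracy is highest at epoch 19. The card does not say which checkpoint was saved.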

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
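
Although the data is not documented, the Accuracy column of the training results table constrains the size of the evaluation set. The sketch below searches for the smallest set size for which every reported accuracy is a whole number of correct predictions. It assumes accuracy is a plain fraction of correct predictions rounded to four decimals. Any multiple of the result would fit equally well, so it is a lower bound rather than a confirmed size.

```python
# Distinct accuracy values copied from the training results table.
reported = [
    0.5, 0.5156, 0.5312, 0.5938, 0.6094, 0.6719, 0.6875,
    0.7188, 0.7344, 0.75, 0.7656, 0.7812, 0.7969,
]

# Values are rounded to four decimals, so allow half a unit in the last
# place, plus a tiny margin for floating-point error.
TOLERANCE = 5e-5 + 1e-9


def consistent(n, values):
    """True if every value is within TOLERANCE of k/n for some integer k."""
    return all(abs(round(v * n) / n - v) <= TOLERANCE for v in values)


def smallest_eval_size(values, limit=1000):
    """Smallest n in 1..limit consistent with all values, or None."""
    for n in range(1, limit + 1):
        if consistent(n, values):
            return n
    return None


size = smallest_eval_size(reported)
print(size)
```

Under these assumptions the search returns 64, meaning each reported accuracy is a multiple of 1/64.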

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
num_epochs: 75
label_smoothing_factor: 0.45
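
To make the schedule concrete: the training results table ends at step 150, so a linear schedule with 50 warmup steps would raise the learning rate from 0 to 1e-05 over the first 50 steps and then lower it linearly to 0 at step 150. The sketch below reimplements that shape in plain Python, assuming the usual linear-warmup, linear-decay form. It is an illustration and not the code used for training; the function name is hypothetical.

```python
PEAK_LR = 1e-05     # learning_rate from the list above
WARMUP_STEPS = 50   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 150   # last step in the training results table


def linear_schedule_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < warmup:
        factor = step / max(1, warmup)
    else:
        factor = max(0.0, (total - step) / max(1, total - warmup))
    return peak_lr * factor


for step in (0, 25, 50, 100, 150):
    print(step, linear_schedule_lr(step))
```

With these values the warmup covers the first third of training (roughly the first 25 epochs at 2 steps per epoch), so the model only sees the peak learning rate around epoch 25.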

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	2	0.8178	0.5156
No log	2.0	4	0.8133	0.5156
No log	3.0	6	0.8065	0.5156
No log	4.0	8	0.7961	0.5156
0.8123	5.0	10	0.7821	0.5156
0.8123	6.0	12	0.7655	0.5
0.8123	7.0	14	0.7460	0.5
0.8123	8.0	16	0.7247	0.5
0.8123	9.0	18	0.7034	0.5312
0.751	10.0	20	0.6892	0.5938
0.751	11.0	22	0.6808	0.6094
0.751	12.0	24	0.6761	0.6719
0.751	13.0	26	0.6715	0.75
0.751	14.0	28	0.6665	0.7812
0.6479	15.0	30	0.6624	0.75
0.6479	16.0	32	0.6615	0.7344
0.6479	17.0	34	0.6572	0.7344
0.6479	18.0	36	0.6529	0.7656
0.6479	19.0	38	0.6503	0.7969
0.5876	20.0	40	0.6499	0.7812
0.5876	21.0	42	0.6496	0.7656
0.5876	22.0	44	0.6502	0.7344
0.5876	23.0	46	0.6536	0.75
0.5876	24.0	48	0.6593	0.7344
0.5439	25.0	50	0.6605	0.7344
0.5439	26.0	52	0.6592	0.7344
0.5439	27.0	54	0.6578	0.75
0.5439	28.0	56	0.6575	0.75
0.5439	29.0	58	0.6571	0.7344
0.5429	30.0	60	0.6575	0.75
0.5429	31.0	62	0.6635	0.75
0.5429	32.0	64	0.6681	0.7344
0.5429	33.0	66	0.6705	0.7188
0.5429	34.0	68	0.6701	0.6875
0.5404	35.0	70	0.6664	0.7188
0.5404	36.0	72	0.6621	0.7344
0.5404	37.0	74	0.6599	0.7344
0.5404	38.0	76	0.6604	0.7344
0.5404	39.0	78	0.6637	0.7344
0.5403	40.0	80	0.6647	0.7344
0.5403	41.0	82	0.6641	0.7344
0.5403	42.0	84	0.6633	0.7344
0.5403	43.0	86	0.6663	0.7344
0.5403	44.0	88	0.6699	0.7344
0.5406	45.0	90	0.6684	0.7344
0.5406	46.0	92	0.6625	0.7344
0.5406	47.0	94	0.6582	0.75
0.5406	48.0	96	0.6549	0.75
0.5406	49.0	98	0.6523	0.7656
0.54	50.0	100	0.6523	0.75
0.54	51.0	102	0.6525	0.75
0.54	52.0	104	0.6531	0.75
0.54	53.0	106	0.6534	0.75
0.54	54.0	108	0.6539	0.75
0.5396	55.0	110	0.6553	0.7656
0.5396	56.0	112	0.6540	0.75
0.5396	57.0	114	0.6555	0.7656
0.5396	58.0	116	0.6565	0.7656
0.5396	59.0	118	0.6588	0.7656
0.5403	60.0	120	0.6609	0.75
0.5403	61.0	122	0.6621	0.7344
0.5403	62.0	124	0.6619	0.7344
0.5403	63.0	126	0.6614	0.7344
0.5403	64.0	128	0.6599	0.7344
0.5405	65.0	130	0.6586	0.75
0.5405	66.0	132	0.6583	0.7656
0.5405	67.0	134	0.6580	0.7656
0.5405	68.0	136	0.6582	0.75
0.5405	69.0	138	0.6586	0.75
0.5399	70.0	140	0.6591	0.75
0.5399	71.0	142	0.6592	0.75
0.5399	72.0	144	0.6592	0.75
0.5399	73.0	146	0.6594	0.75
0.5399	74.0	148	0.6594	0.75
0.5403	75.0	150	0.6595	0.75
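
The training loss settles near 0.54 from about epoch 25 onward instead of approaching zero. One plausible reason is the label smoothing factor of 0.45: with smoothed targets, cross-entropy has a nonzero minimum equal to the entropy of the target distribution. The sketch below computes that floor. It assumes two classes (suggested by "sst-2" in the model name, not stated in this card) and the common formulation in which the target is (1 - epsilon) times the one-hot label plus epsilon spread uniformly over all classes.

```python
import math


def smoothed_ce_floor(epsilon, num_classes):
    """Minimum cross-entropy achievable against label-smoothed targets."""
    # Target probability on the true class and on each other class.
    p_true = (1.0 - epsilon) + epsilon / num_classes
    p_other = epsilon / num_classes
    # The minimum is the entropy of the target distribution, reached when
    # the predicted distribution equals the target.
    entropy = -p_true * math.log(p_true)
    if p_other > 0:
        entropy -= (num_classes - 1) * p_other * math.log(p_other)
    return entropy


floor = smoothed_ce_floor(0.45, 2)
print(round(floor, 4))
```

Under these assumptions the floor is about 0.533, close to the plateau in the Training Loss column, which would mean the model fits its training examples almost as well as the smoothed objective allows.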

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3

Model repository: simonycl/bert-large-uncased-sst-2-32-13-smoothed
