bert-large-uncased-sst-2-16-13

This model is a fine-tuned version of bert-large-uncased on an unknown dataset (judging by the model name, most likely a 16-example-per-seed subset of SST-2, though this is not documented). It achieves the following results on the evaluation set:

  • Loss: 0.6280
  • Accuracy: 0.7812

Model description

More information needed

Intended uses & limitations

More information needed
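
Although the intended-use details are missing, the checkpoint is a sequence-classification fine-tune of bert-large-uncased, so it can presumably be loaded as sketched below. This is a usage sketch, not an official example: the repository id is taken from this model card, and the 0 = negative / 1 = positive label mapping follows the usual SST-2 convention rather than anything documented here.

  # Minimal usage sketch (assumptions: the checkpoint exports a standard
  # AutoModelForSequenceClassification head and labels follow the SST-2
  # convention of 0 = negative, 1 = positive).
  import torch
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

  model_id = "simonycl/bert-large-uncased-sst-2-16-13"  # assumed repository id
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForSequenceClassification.from_pretrained(model_id)
  model.eval()

  inputs = tokenizer("A thoroughly enjoyable film.", return_tensors="pt")
  with torch.no_grad():
      logits = model(**inputs).logits
  prediction = logits.argmax(dim=-1).item()
  print("positive" if prediction == 1 else "negative")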

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 150
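
These values map directly onto fields of the Hugging Face TrainingArguments class. The sketch below is one plausible reconstruction of the configuration, not the author's actual training script; the output directory, dataset variables, and per-epoch evaluation strategy are assumptions.

  # Hedged reconstruction of the configuration above; the Adam betas and
  # epsilon listed match the Trainer defaults, so they are not set explicitly.
  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="bert-large-uncased-sst-2-16-13",  # placeholder
      learning_rate=1e-5,
      per_device_train_batch_size=32,
      per_device_eval_batch_size=32,
      seed=42,
      lr_scheduler_type="linear",
      warmup_steps=50,
      num_train_epochs=150,
      evaluation_strategy="epoch",  # assumed: validation is reported every epoch below
  )

  # A Trainer would then be built as usual (model and datasets assumed):
  # trainer = Trainer(model=model, args=training_args,
  #                   train_dataset=train_dataset, eval_dataset=eval_dataset)
  # trainer.train()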

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log 1.0 1 0.7881 0.5
No log 2.0 2 0.7873 0.5
No log 3.0 3 0.7860 0.5
No log 4.0 4 0.7840 0.5
No log 5.0 5 0.7810 0.5
No log 6.0 6 0.7772 0.5
No log 7.0 7 0.7723 0.5
No log 8.0 8 0.7668 0.5
No log 9.0 9 0.7600 0.5
0.782 10.0 10 0.7522 0.5
0.782 11.0 11 0.7438 0.5
0.782 12.0 12 0.7344 0.5
0.782 13.0 13 0.7252 0.5
0.782 14.0 14 0.7148 0.5
0.782 15.0 15 0.7043 0.5
0.782 16.0 16 0.6943 0.5
0.782 17.0 17 0.6857 0.4688
0.782 18.0 18 0.6769 0.5
0.782 19.0 19 0.6674 0.5312
0.685 20.0 20 0.6591 0.5938
0.685 21.0 21 0.6526 0.625
0.685 22.0 22 0.6435 0.625
0.685 23.0 23 0.6347 0.5938
0.685 24.0 24 0.6278 0.625
0.685 25.0 25 0.6261 0.5938
0.685 26.0 26 0.6250 0.625
0.685 27.0 27 0.6247 0.625
0.685 28.0 28 0.6225 0.625
0.685 29.0 29 0.6159 0.6562
0.4699 30.0 30 0.6056 0.6562
0.4699 31.0 31 0.5906 0.6875
0.4699 32.0 32 0.5795 0.6875
0.4699 33.0 33 0.5844 0.7812
0.4699 34.0 34 0.5925 0.7188
0.4699 35.0 35 0.5942 0.7188
0.4699 36.0 36 0.5956 0.6875
0.4699 37.0 37 0.5921 0.6875
0.4699 38.0 38 0.5860 0.6875
0.4699 39.0 39 0.5844 0.6875
0.3039 40.0 40 0.5793 0.7188
0.3039 41.0 41 0.5738 0.75
0.3039 42.0 42 0.5734 0.75
0.3039 43.0 43 0.5744 0.75
0.3039 44.0 44 0.5782 0.6875
0.3039 45.0 45 0.5817 0.6875
0.3039 46.0 46 0.5858 0.6875
0.3039 47.0 47 0.5888 0.6875
0.3039 48.0 48 0.5836 0.6875
0.3039 49.0 49 0.5724 0.7188
0.1969 50.0 50 0.5572 0.7188
0.1969 51.0 51 0.5442 0.7812
0.1969 52.0 52 0.5347 0.7812
0.1969 53.0 53 0.5288 0.7812
0.1969 54.0 54 0.5284 0.75
0.1969 55.0 55 0.5307 0.7812
0.1969 56.0 56 0.5386 0.7812
0.1969 57.0 57 0.5475 0.75
0.1969 58.0 58 0.5535 0.75
0.1969 59.0 59 0.5550 0.7188
0.1348 60.0 60 0.5533 0.7188
0.1348 61.0 61 0.5412 0.7812
0.1348 62.0 62 0.5322 0.7812
0.1348 63.0 63 0.5256 0.8125
0.1348 64.0 64 0.5189 0.8125
0.1348 65.0 65 0.5148 0.8125
0.1348 66.0 66 0.5154 0.7812
0.1348 67.0 67 0.5162 0.75
0.1348 68.0 68 0.5202 0.75
0.1348 69.0 69 0.5255 0.75
0.0823 70.0 70 0.5330 0.75
0.0823 71.0 71 0.5367 0.75
0.0823 72.0 72 0.5413 0.75
0.0823 73.0 73 0.5434 0.75
0.0823 74.0 74 0.5415 0.75
0.0823 75.0 75 0.5395 0.75
0.0823 76.0 76 0.5394 0.75
0.0823 77.0 77 0.5380 0.75
0.0823 78.0 78 0.5379 0.75
0.0823 79.0 79 0.5396 0.75
0.0519 80.0 80 0.5426 0.75
0.0519 81.0 81 0.5426 0.75
0.0519 82.0 82 0.5419 0.75
0.0519 83.0 83 0.5446 0.75
0.0519 84.0 84 0.5467 0.75
0.0519 85.0 85 0.5487 0.75
0.0519 86.0 86 0.5522 0.75
0.0519 87.0 87 0.5566 0.75
0.0519 88.0 88 0.5614 0.75
0.0519 89.0 89 0.5672 0.75
0.0382 90.0 90 0.5713 0.75
0.0382 91.0 91 0.5744 0.75
0.0382 92.0 92 0.5773 0.75
0.0382 93.0 93 0.5799 0.75
0.0382 94.0 94 0.5806 0.75
0.0382 95.0 95 0.5777 0.75
0.0382 96.0 96 0.5761 0.75
0.0382 97.0 97 0.5746 0.75
0.0382 98.0 98 0.5710 0.7812
0.0382 99.0 99 0.5697 0.7812
0.0266 100.0 100 0.5676 0.7812
0.0266 101.0 101 0.5650 0.7812
0.0266 102.0 102 0.5637 0.7812
0.0266 103.0 103 0.5623 0.7812
0.0266 104.0 104 0.5631 0.7812
0.0266 105.0 105 0.5633 0.7812
0.0266 106.0 106 0.5635 0.7812
0.0266 107.0 107 0.5638 0.8125
0.0266 108.0 108 0.5646 0.7812
0.0266 109.0 109 0.5662 0.7812
0.0205 110.0 110 0.5694 0.7812
0.0205 111.0 111 0.5737 0.7812
0.0205 112.0 112 0.5797 0.7812
0.0205 113.0 113 0.5851 0.7812
0.0205 114.0 114 0.5923 0.7812
0.0205 115.0 115 0.6008 0.7812
0.0205 116.0 116 0.6091 0.7812
0.0205 117.0 117 0.6162 0.75
0.0205 118.0 118 0.6201 0.75
0.0205 119.0 119 0.6233 0.75
0.0168 120.0 120 0.6255 0.75
0.0168 121.0 121 0.6274 0.75
0.0168 122.0 122 0.6293 0.75
0.0168 123.0 123 0.6265 0.75
0.0168 124.0 124 0.6245 0.75
0.0168 125.0 125 0.6239 0.75
0.0168 126.0 126 0.6232 0.75
0.0168 127.0 127 0.6221 0.7812
0.0168 128.0 128 0.6216 0.7812
0.0168 129.0 129 0.6213 0.7812
0.0139 130.0 130 0.6214 0.7812
0.0139 131.0 131 0.6212 0.7812
0.0139 132.0 132 0.6218 0.7812
0.0139 133.0 133 0.6234 0.7812
0.0139 134.0 134 0.6248 0.7812
0.0139 135.0 135 0.6259 0.7812
0.0139 136.0 136 0.6269 0.7812
0.0139 137.0 137 0.6275 0.7812
0.0139 138.0 138 0.6277 0.7812
0.0139 139.0 139 0.6280 0.7812
0.0126 140.0 140 0.6281 0.7812
0.0126 141.0 141 0.6283 0.7812
0.0126 142.0 142 0.6281 0.7812
0.0126 143.0 143 0.6279 0.7812
0.0126 144.0 144 0.6279 0.7812
0.0126 145.0 145 0.6278 0.7812
0.0126 146.0 146 0.6278 0.7812
0.0126 147.0 147 0.6279 0.7812
0.0126 148.0 148 0.6279 0.7812
0.0126 149.0 149 0.6279 0.7812
0.0121 150.0 150 0.6280 0.7812

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3