scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. On the evaluation set, the final logged checkpoint (step 22500) achieves the following results:

Loss: 125.3087
Accuracy: 0.5552
F1: 0.5557

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
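
The relationship between the batch-size entries above, and the shape of a linear schedule, can be sketched in a few lines of plain Python. This is an illustration only, not the training script: the function names are hypothetical, the sketch assumes zero warmup steps (none are listed above) and a single device, and the total step count is an estimate based on roughly 459 optimizer steps per epoch, as implied by the training log.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
NUM_EPOCHS = 50

# ASSUMPTION: about 459 optimizer steps per epoch, estimated from the
# training log (step 22500 is reached at epoch 48.9396).
ASSUMED_STEPS_PER_EPOCH = 459
ASSUMED_TOTAL_STEPS = ASSUMED_STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_step_batch: int, accumulation_steps: int) -> int:
    """Examples contributing to one optimizer update (single device assumed)."""
    return per_step_batch * accumulation_steps


def linear_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 32
print(linear_lr(0, ASSUMED_TOTAL_STEPS, LEARNING_RATE))
print(linear_lr(ASSUMED_TOTAL_STEPS // 2, ASSUMED_TOTAL_STEPS, LEARNING_RATE))
```

The first value reproduces total_train_batch_size (8 x 4 = 32). Under the stated assumptions, the learning rate starts at 5e-05 and falls to roughly half that value midway through training.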

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
196.6841	1.0875	500	166.6670	0.3484	0.2582
168.7515	2.1751	1000	157.8051	0.4136	0.4041
160.6266	3.2626	1500	153.1391	0.4471	0.4197
152.2116	4.3502	2000	150.9196	0.4225	0.4031
145.1439	5.4377	2500	145.2500	0.4988	0.4938
137.2401	6.5253	3000	145.8920	0.4992	0.4899
129.4263	7.6128	3500	139.7630	0.5123	0.5121
122.0069	8.7004	4000	142.0804	0.5096	0.4979
116.0619	9.7879	4500	142.2508	0.5297	0.5245
110.0063	10.8755	5000	136.2582	0.5444	0.5442
105.7	11.9630	5500	133.7262	0.5390	0.5382
101.5969	13.0506	6000	135.1363	0.5370	0.5366
98.3408	14.1381	6500	135.0693	0.5417	0.5426
95.2212	15.2257	7000	134.8993	0.5494	0.5498
92.8186	16.3132	7500	132.5809	0.5328	0.5321
90.2266	17.4008	8000	131.8017	0.5328	0.5343
88.4703	18.4883	8500	131.0873	0.5235	0.5213
87.1873	19.5759	9000	131.4287	0.5382	0.5379
85.7516	20.6634	9500	131.1887	0.5455	0.5448
84.7201	21.7510	10000	129.0503	0.5463	0.5463
83.2881	22.8385	10500	130.1761	0.5432	0.5438
82.0884	23.9260	11000	128.6964	0.5355	0.5366
81.8031	25.0136	11500	128.3357	0.5421	0.5415
80.7617	26.1011	12000	129.1492	0.5428	0.5441
80.189	27.1887	12500	127.5801	0.5552	0.5565
79.3389	28.2762	13000	127.4090	0.5370	0.5387
78.7626	29.3638	13500	129.6783	0.5505	0.5514
78.3347	30.4513	14000	129.3366	0.5517	0.5505
77.8829	31.5389	14500	128.1779	0.5421	0.5412
77.4036	32.6264	15000	128.5850	0.5309	0.5308
76.9029	33.7140	15500	126.4255	0.5509	0.5509
76.5128	34.8015	16000	126.3467	0.5424	0.5429
76.2921	35.8891	16500	125.5003	0.5428	0.5446
75.861	36.9766	17000	126.6583	0.5486	0.5496
75.5675	38.0642	17500	126.0611	0.5525	0.5519
75.2374	39.1517	18000	126.2538	0.5451	0.5457
75.1584	40.2393	18500	126.7418	0.5463	0.5448
74.9139	41.3268	19000	126.7482	0.5505	0.5491
74.565	42.4144	19500	126.0126	0.5494	0.5507
74.3026	43.5019	20000	124.4608	0.5563	0.5573
74.4625	44.5895	20500	125.6297	0.5563	0.5569
74.1991	45.6770	21000	125.5350	0.5498	0.5497
74.0552	46.7645	21500	124.8100	0.5552	0.5560
74.0304	47.8521	22000	126.1095	0.5482	0.5483
73.8794	48.9396	22500	125.3087	0.5552	0.5557
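
To compare checkpoints, the log can be parsed and ranked by a chosen metric. The sketch below is a hypothetical helper, not part of the training code. It embeds only the last six rows of the table above as an excerpt, then picks the row with the highest F1 and the row with the lowest validation loss.

```python
import csv
import io

# Excerpt: the last six rows of the training log above (tab-separated).
LOG_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\tF1\n"
    "74.3026\t43.5019\t20000\t124.4608\t0.5563\t0.5573\n"
    "74.4625\t44.5895\t20500\t125.6297\t0.5563\t0.5569\n"
    "74.1991\t45.6770\t21000\t125.5350\t0.5498\t0.5497\n"
    "74.0552\t46.7645\t21500\t124.8100\t0.5552\t0.5560\n"
    "74.0304\t47.8521\t22000\t126.1095\t0.5482\t0.5483\n"
    "73.8794\t48.9396\t22500\t125.3087\t0.5552\t0.5557\n"
)


def parse_log(text):
    """Parse the tab-separated log into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for raw in reader:
        row = {name: float(value) for name, value in raw.items()}
        row["Step"] = int(row["Step"])
        rows.append(row)
    return rows


def best_row(rows, metric, higher_is_better=True):
    """Return the row with the best value for the given metric column."""
    chooser = max if higher_is_better else min
    return chooser(rows, key=lambda row: row[metric])


rows = parse_log(LOG_EXCERPT)
best_f1 = best_row(rows, "F1")
lowest_loss = best_row(rows, "Validation Loss", higher_is_better=False)
print(best_f1["Step"], best_f1["F1"])  # 20000 0.5573
print(lowest_loss["Step"], lowest_loss["Validation Loss"])  # 20000 124.4608
```

In this excerpt, and in the full table, both the highest F1 (0.5573) and the lowest validation loss (124.4608) fall at step 20000, while the headline metrics match the final row at step 22500.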

Framework versions

Transformers 4.44.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
