scenario-KD-SCR-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. It achieves the following results on the evaluation set, which match the final logged checkpoint (step 22500) in the training results:

Loss: 93.5296
Accuracy: 0.4934
F1: 0.4802

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
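
As a rough sketch of how these settings fit together, the snippet below collects them under the parameter names used by `transformers.TrainingArguments` and checks the batch-size arithmetic. The mapping to `TrainingArguments`, the single-device setup, and the absence of warmup are assumptions that this card does not state. `effective_batch_size` and `linear_lr` are hypothetical helpers for illustration, not part of the original training script.

```python
# Hyperparameters from the list above, keyed by the parameter names that
# transformers.TrainingArguments uses (assumed mapping; the original
# training script is not part of this card).
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 32,
    "seed": 66,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Hypothetical helper: optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))  # prints: 32
```

With one device, 8 examples per forward pass accumulated over 4 steps gives the listed total_train_batch_size of 32. If the dictionary keys match the real training setup, it could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`.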

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
595.6379	1.0875	500	529.9193	0.3356	0.2426
492.9708	2.1751	1000	462.9991	0.3495	0.2479
434.1923	3.2626	1500	413.0421	0.3542	0.2526
387.9462	4.3502	2000	372.6939	0.3461	0.2256
349.1363	5.4377	2500	339.4581	0.3812	0.3181
316.3708	6.5253	3000	309.5960	0.3600	0.2482
287.2088	7.6128	3500	283.3434	0.3623	0.2565
261.6174	8.7004	4000	260.5347	0.3966	0.3186
238.8991	9.7879	4500	239.4077	0.3812	0.2898
218.7973	10.8755	5000	221.1140	0.3827	0.2947
200.9635	11.9630	5500	204.9407	0.3758	0.2886
184.9722	13.0506	6000	190.5849	0.3858	0.3143
170.6476	14.1381	6500	177.7994	0.4024	0.3499
158.3291	15.2257	7000	167.1440	0.4082	0.3470
147.5937	16.3132	7500	157.3985	0.4012	0.3365
137.7495	17.4008	8000	148.9876	0.4390	0.4092
129.3039	18.4883	8500	141.1713	0.4329	0.3858
121.8619	19.5759	9000	134.6723	0.3819	0.2814
115.5525	20.6634	9500	129.3189	0.4375	0.4160
109.919	21.7510	10000	124.4156	0.4390	0.4158
104.8916	22.8385	10500	120.0901	0.4633	0.4634
100.5612	23.9260	11000	116.1249	0.4367	0.3906
96.8756	25.0136	11500	113.0662	0.4275	0.3621
93.5406	26.1011	12000	110.1482	0.4198	0.3537
90.7912	27.1887	12500	107.7850	0.4429	0.4133
88.2387	28.2762	13000	105.7622	0.4587	0.4306
86.0144	29.3638	13500	103.9457	0.4691	0.4589
84.1688	30.4513	14000	102.5328	0.4888	0.4795
82.6269	31.5389	14500	101.1704	0.4560	0.4397
81.273	32.6264	15000	100.1563	0.4865	0.4733
79.7455	33.7140	15500	98.7990	0.4796	0.4638
78.7022	34.8015	16000	97.9996	0.5058	0.5022
77.7623	35.8891	16500	97.3651	0.4834	0.4689
76.9524	36.9766	17000	96.6101	0.4792	0.4614
76.1552	38.0642	17500	96.0791	0.4834	0.4741
75.4054	39.1517	18000	95.6724	0.4958	0.4866
74.8692	40.2393	18500	95.0627	0.5012	0.4933
74.3489	41.3268	19000	94.8155	0.5085	0.5056
73.8464	42.4144	19500	94.4659	0.4691	0.4298
73.4083	43.5019	20000	94.2774	0.5008	0.4902
73.2075	44.5895	20500	93.8413	0.4776	0.4563
72.8988	45.6770	21000	93.7901	0.4988	0.4833
72.495	46.7645	21500	93.6844	0.4942	0.4805
72.3899	47.8521	22000	93.5561	0.4726	0.4454
72.2684	48.9396	22500	93.5296	0.4934	0.4802
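
The final row is not the strongest one in the table: the highest accuracy and F1 (0.5085 and 0.5056) are logged at step 19000. A minimal sketch for picking the best row from a tab-separated copy of the table follows. `TABLE_EXCERPT` holds only three rows copied from above, and `best_row` is a hypothetical helper, not something shipped with the model.

```python
import csv
import io

# Excerpt of the tab-separated training results table (three rows only).
TABLE_EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\tF1\n"
    "78.7022\t34.8015\t16000\t97.9996\t0.5058\t0.5022\n"
    "74.3489\t41.3268\t19000\t94.8155\t0.5085\t0.5056\n"
    "72.2684\t48.9396\t22500\t93.5296\t0.4934\t0.4802\n"
)


def best_row(table_text, metric="F1"):
    """Return the row with the highest value of `metric` (higher is better)."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = [{key: float(value) for key, value in row.items()} for row in reader]
    return max(rows, key=lambda row: row[metric])


best = best_row(TABLE_EXCERPT)
print(int(best["Step"]), best["F1"])  # prints: 19000 0.5056
```

The helper maximizes the chosen column, so it suits Accuracy and F1. For Validation Loss, where lower is better, `min` would be used instead.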

Framework versions

Transformers 4.44.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
