scenario-KD-SCR-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66
This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. It achieves the following results on the evaluation set:
- Loss: 93.5296
- Accuracy: 0.4934
- F1: 0.4802
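For readers who want to try the checkpoint, here is a minimal loading sketch; it is not taken from the original card. It assumes that the repository id below is correct, that the repository ships a tokenizer and a sequence-classification head, and that `transformers` with a PyTorch backend is installed. `load_classifier` and `classify` are hypothetical helper names, and the label strings returned depend on the model's own config.

```python
# Repository id as it appears on this model card.
MODEL_ID = "haryoaw/scenario-KD-SCR-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline for the checkpoint.

    Assumption: the repo ships a tokenizer and a classification head.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier=None):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    classifier = classifier or load_classifier()
    return classifier(list(texts))
```

Calling `classify(["I love this!"])` downloads the weights on first use. Passing your own `classifier` avoids reloading the model on every call.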
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
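The relationship between these values can be sketched in plain Python. The example shows how `total_train_batch_size` follows from `train_batch_size` and `gradient_accumulation_steps`, and what a linear schedule does to the learning rate. Two things are assumptions, not stated in the card: a single training device, and zero warmup steps. The total step count is only an estimate derived from the first row of the results table (step 500 at epoch 1.0875).

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
NUM_EPOCHS = 50


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Estimate from the first results row: step 500 was logged at epoch 1.0875.
STEPS_PER_EPOCH = 500 / 1.0875
TOTAL_STEPS = round(NUM_EPOCHS * STEPS_PER_EPOCH)  # rough estimate, not a logged value

if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    print(linear_lr(TOTAL_STEPS // 2, TOTAL_STEPS))
```

With 8 examples per forward pass and 4 accumulation steps, each optimizer update sees 32 examples, which matches the listed `total_train_batch_size`.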
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
595.6379 | 1.0875 | 500 | 529.9193 | 0.3356 | 0.2426 |
492.9708 | 2.1751 | 1000 | 462.9991 | 0.3495 | 0.2479 |
434.1923 | 3.2626 | 1500 | 413.0421 | 0.3542 | 0.2526 |
387.9462 | 4.3502 | 2000 | 372.6939 | 0.3461 | 0.2256 |
349.1363 | 5.4377 | 2500 | 339.4581 | 0.3812 | 0.3181 |
316.3708 | 6.5253 | 3000 | 309.5960 | 0.3600 | 0.2482 |
287.2088 | 7.6128 | 3500 | 283.3434 | 0.3623 | 0.2565 |
261.6174 | 8.7004 | 4000 | 260.5347 | 0.3966 | 0.3186 |
238.8991 | 9.7879 | 4500 | 239.4077 | 0.3812 | 0.2898 |
218.7973 | 10.8755 | 5000 | 221.1140 | 0.3827 | 0.2947 |
200.9635 | 11.9630 | 5500 | 204.9407 | 0.3758 | 0.2886 |
184.9722 | 13.0506 | 6000 | 190.5849 | 0.3858 | 0.3143 |
170.6476 | 14.1381 | 6500 | 177.7994 | 0.4024 | 0.3499 |
158.3291 | 15.2257 | 7000 | 167.1440 | 0.4082 | 0.3470 |
147.5937 | 16.3132 | 7500 | 157.3985 | 0.4012 | 0.3365 |
137.7495 | 17.4008 | 8000 | 148.9876 | 0.4390 | 0.4092 |
129.3039 | 18.4883 | 8500 | 141.1713 | 0.4329 | 0.3858 |
121.8619 | 19.5759 | 9000 | 134.6723 | 0.3819 | 0.2814 |
115.5525 | 20.6634 | 9500 | 129.3189 | 0.4375 | 0.4160 |
109.919 | 21.7510 | 10000 | 124.4156 | 0.4390 | 0.4158 |
104.8916 | 22.8385 | 10500 | 120.0901 | 0.4633 | 0.4634 |
100.5612 | 23.9260 | 11000 | 116.1249 | 0.4367 | 0.3906 |
96.8756 | 25.0136 | 11500 | 113.0662 | 0.4275 | 0.3621 |
93.5406 | 26.1011 | 12000 | 110.1482 | 0.4198 | 0.3537 |
90.7912 | 27.1887 | 12500 | 107.7850 | 0.4429 | 0.4133 |
88.2387 | 28.2762 | 13000 | 105.7622 | 0.4587 | 0.4306 |
86.0144 | 29.3638 | 13500 | 103.9457 | 0.4691 | 0.4589 |
84.1688 | 30.4513 | 14000 | 102.5328 | 0.4888 | 0.4795 |
82.6269 | 31.5389 | 14500 | 101.1704 | 0.4560 | 0.4397 |
81.273 | 32.6264 | 15000 | 100.1563 | 0.4865 | 0.4733 |
79.7455 | 33.7140 | 15500 | 98.7990 | 0.4796 | 0.4638 |
78.7022 | 34.8015 | 16000 | 97.9996 | 0.5058 | 0.5022 |
77.7623 | 35.8891 | 16500 | 97.3651 | 0.4834 | 0.4689 |
76.9524 | 36.9766 | 17000 | 96.6101 | 0.4792 | 0.4614 |
76.1552 | 38.0642 | 17500 | 96.0791 | 0.4834 | 0.4741 |
75.4054 | 39.1517 | 18000 | 95.6724 | 0.4958 | 0.4866 |
74.8692 | 40.2393 | 18500 | 95.0627 | 0.5012 | 0.4933 |
74.3489 | 41.3268 | 19000 | 94.8155 | 0.5085 | 0.5056 |
73.8464 | 42.4144 | 19500 | 94.4659 | 0.4691 | 0.4298 |
73.4083 | 43.5019 | 20000 | 94.2774 | 0.5008 | 0.4902 |
73.2075 | 44.5895 | 20500 | 93.8413 | 0.4776 | 0.4563 |
72.8988 | 45.6770 | 21000 | 93.7901 | 0.4988 | 0.4833 |
72.495 | 46.7645 | 21500 | 93.6844 | 0.4942 | 0.4805 |
72.3899 | 47.8521 | 22000 | 93.5561 | 0.4726 | 0.4454 |
72.2684 | 48.9396 | 22500 | 93.5296 | 0.4934 | 0.4802 |
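The reported evaluation metrics come from the last logged step (22500), which is not the best row in the table: both accuracy and F1 peak at step 19000. The sketch below illustrates that comparison using the rows from step 16000 onward, copied from the table above. The `Row` type and variable names are hypothetical, and whether a checkpoint was actually saved at step 19000 is not stated in the card.

```python
from collections import namedtuple

Row = namedtuple("Row", ["step", "val_loss", "accuracy", "f1"])

# Rows copied from the results table (step 16000 onward).
LATE_ROWS = [
    Row(16000, 97.9996, 0.5058, 0.5022),
    Row(16500, 97.3651, 0.4834, 0.4689),
    Row(17000, 96.6101, 0.4792, 0.4614),
    Row(17500, 96.0791, 0.4834, 0.4741),
    Row(18000, 95.6724, 0.4958, 0.4866),
    Row(18500, 95.0627, 0.5012, 0.4933),
    Row(19000, 94.8155, 0.5085, 0.5056),
    Row(19500, 94.4659, 0.4691, 0.4298),
    Row(20000, 94.2774, 0.5008, 0.4902),
    Row(20500, 93.8413, 0.4776, 0.4563),
    Row(21000, 93.7901, 0.4988, 0.4833),
    Row(21500, 93.6844, 0.4942, 0.4805),
    Row(22000, 93.5561, 0.4726, 0.4454),
    Row(22500, 93.5296, 0.4934, 0.4802),
]

best_by_f1 = max(LATE_ROWS, key=lambda r: r.f1)
best_by_acc = max(LATE_ROWS, key=lambda r: r.accuracy)
lowest_loss = min(LATE_ROWS, key=lambda r: r.val_loss)
final = LATE_ROWS[-1]

# Accuracy given up by reporting the final step instead of the best one.
accuracy_gap = best_by_acc.accuracy - final.accuracy

if __name__ == "__main__":
    print(best_by_f1.step, best_by_acc.step, lowest_loss.step)
    print(round(accuracy_gap, 4))
```

Validation loss keeps falling through the final step while accuracy and F1 fluctuate, so choosing a checkpoint by loss and choosing one by F1 give different answers here.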
Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1
Model tree for haryoaw/scenario-KD-SCR-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66
- Base model: microsoft/mdeberta-v3-base
- Finetuned: haryoaw/scenario-MDBT-TCR-TSM