scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss
This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. It achieves the following results on the evaluation set:
- Loss: 125.3087
- Accuracy: 0.5552
- F1: 0.5557
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
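
How the batch-size settings and the linear schedule listed above fit together can be sketched in plain Python. This is a minimal illustration, not the training script. The function names are hypothetical, and a single device and zero warmup steps are assumptions, since the card states neither. `TOTAL_STEPS` is a placeholder because the card does not give the total number of optimizer steps.

```python
def effective_batch_size(per_device_batch: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5,
              warmup_steps: int = 0) -> float:
    """Optional linear warmup, then linear decay from base_lr to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Placeholder value: the total step count is not stated in the card.
TOTAL_STEPS = 23_000

# train_batch_size=8 with gradient_accumulation_steps=4 (one device assumed)
print(effective_batch_size(8, 4))  # 32, matching total_train_batch_size

# Learning rate at the start, midpoint, and end of the assumed schedule
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 5e-05 and reaches zero at the final step. A nonzero `warmup_steps` would change only the early part of the curve.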
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
196.6841 | 1.0875 | 500 | 166.6670 | 0.3484 | 0.2582 |
168.7515 | 2.1751 | 1000 | 157.8051 | 0.4136 | 0.4041 |
160.6266 | 3.2626 | 1500 | 153.1391 | 0.4471 | 0.4197 |
152.2116 | 4.3502 | 2000 | 150.9196 | 0.4225 | 0.4031 |
145.1439 | 5.4377 | 2500 | 145.2500 | 0.4988 | 0.4938 |
137.2401 | 6.5253 | 3000 | 145.8920 | 0.4992 | 0.4899 |
129.4263 | 7.6128 | 3500 | 139.7630 | 0.5123 | 0.5121 |
122.0069 | 8.7004 | 4000 | 142.0804 | 0.5096 | 0.4979 |
116.0619 | 9.7879 | 4500 | 142.2508 | 0.5297 | 0.5245 |
110.0063 | 10.8755 | 5000 | 136.2582 | 0.5444 | 0.5442 |
105.7000 | 11.9630 | 5500 | 133.7262 | 0.5390 | 0.5382 |
101.5969 | 13.0506 | 6000 | 135.1363 | 0.5370 | 0.5366 |
98.3408 | 14.1381 | 6500 | 135.0693 | 0.5417 | 0.5426 |
95.2212 | 15.2257 | 7000 | 134.8993 | 0.5494 | 0.5498 |
92.8186 | 16.3132 | 7500 | 132.5809 | 0.5328 | 0.5321 |
90.2266 | 17.4008 | 8000 | 131.8017 | 0.5328 | 0.5343 |
88.4703 | 18.4883 | 8500 | 131.0873 | 0.5235 | 0.5213 |
87.1873 | 19.5759 | 9000 | 131.4287 | 0.5382 | 0.5379 |
85.7516 | 20.6634 | 9500 | 131.1887 | 0.5455 | 0.5448 |
84.7201 | 21.7510 | 10000 | 129.0503 | 0.5463 | 0.5463 |
83.2881 | 22.8385 | 10500 | 130.1761 | 0.5432 | 0.5438 |
82.0884 | 23.9260 | 11000 | 128.6964 | 0.5355 | 0.5366 |
81.8031 | 25.0136 | 11500 | 128.3357 | 0.5421 | 0.5415 |
80.7617 | 26.1011 | 12000 | 129.1492 | 0.5428 | 0.5441 |
80.1890 | 27.1887 | 12500 | 127.5801 | 0.5552 | 0.5565 |
79.3389 | 28.2762 | 13000 | 127.4090 | 0.5370 | 0.5387 |
78.7626 | 29.3638 | 13500 | 129.6783 | 0.5505 | 0.5514 |
78.3347 | 30.4513 | 14000 | 129.3366 | 0.5517 | 0.5505 |
77.8829 | 31.5389 | 14500 | 128.1779 | 0.5421 | 0.5412 |
77.4036 | 32.6264 | 15000 | 128.5850 | 0.5309 | 0.5308 |
76.9029 | 33.7140 | 15500 | 126.4255 | 0.5509 | 0.5509 |
76.5128 | 34.8015 | 16000 | 126.3467 | 0.5424 | 0.5429 |
76.2921 | 35.8891 | 16500 | 125.5003 | 0.5428 | 0.5446 |
75.8610 | 36.9766 | 17000 | 126.6583 | 0.5486 | 0.5496 |
75.5675 | 38.0642 | 17500 | 126.0611 | 0.5525 | 0.5519 |
75.2374 | 39.1517 | 18000 | 126.2538 | 0.5451 | 0.5457 |
75.1584 | 40.2393 | 18500 | 126.7418 | 0.5463 | 0.5448 |
74.9139 | 41.3268 | 19000 | 126.7482 | 0.5505 | 0.5491 |
74.5650 | 42.4144 | 19500 | 126.0126 | 0.5494 | 0.5507 |
74.3026 | 43.5019 | 20000 | 124.4608 | 0.5563 | 0.5573 |
74.4625 | 44.5895 | 20500 | 125.6297 | 0.5563 | 0.5569 |
74.1991 | 45.6770 | 21000 | 125.5350 | 0.5498 | 0.5497 |
74.0552 | 46.7645 | 21500 | 124.8100 | 0.5552 | 0.5560 |
74.0304 | 47.8521 | 22000 | 126.1095 | 0.5482 | 0.5483 |
73.8794 | 48.9396 | 22500 | 125.3087 | 0.5552 | 0.5557 |
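
The final row is not the strongest one in the log: step 20000 has both the lowest validation loss (124.4608) and the highest F1 (0.5573) in the table. The sketch below shows one way such a row could be picked out programmatically. It uses only the last six rows copied from the table. The helper names are illustrative, and nothing here is taken from the original training code.

```python
# Last six rows of the results table: (step, validation_loss, accuracy, f1)
rows = [
    (20000, 124.4608, 0.5563, 0.5573),
    (20500, 125.6297, 0.5563, 0.5569),
    (21000, 125.5350, 0.5498, 0.5497),
    (21500, 124.8100, 0.5552, 0.5560),
    (22000, 126.1095, 0.5482, 0.5483),
    (22500, 125.3087, 0.5552, 0.5557),
]


def best_by_loss(table):
    """Row with the lowest validation loss."""
    return min(table, key=lambda row: row[1])


def best_by_f1(table):
    """Row with the highest F1 score."""
    return max(table, key=lambda row: row[3])


print("lowest validation loss:", best_by_loss(rows))
print("highest F1:", best_by_f1(rows))
```

Whether an earlier checkpoint such as step 20000 was saved and can be loaded is not stated in the card.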
Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1
Model tree for haryoaw/scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss
- Base model: microsoft/mdeberta-v3-base
- Finetuned: haryoaw/scenario-MDBT-TCR-TSM