scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cardiffnlp_tweet_sentiment_multilingual_all on the tweet_sentiment_multilingual dataset. It achieves the following results on the evaluation set:

  • Loss: 125.3087
  • Accuracy: 0.5552
  • F1: 0.5557
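
The card does not yet include a usage example, so below is a minimal inference sketch rather than a confirmed recipe: it assumes the checkpoint loads through AutoModelForSequenceClassification and that the three output classes follow the cardiffnlp tweet_sentiment_multilingual label order (negative, neutral, positive). Verify the id2label mapping in the model's config before relying on this order.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "haryoaw/scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Assumed label order; check config.id2label on the hub before using in production.
labels = ["negative", "neutral", "positive"]

text = "Je suis très content de ce modèle !"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])
```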

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 66
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
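
For reference, the hyperparameters above map onto a transformers TrainingArguments configuration roughly as sketched below. This is an illustration, not the exact training script: the output directory is hypothetical, a single device is assumed (so a per-device batch size of 8 with 4 gradient-accumulation steps yields the total train batch size of 32), and the 500-step evaluation interval is inferred from the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="scenario-KD-PO-CDF-ALL-D2",  # hypothetical path, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,   # 8 x 4 = total train batch size of 32 on one device
    num_train_epochs=50,
    lr_scheduler_type="linear",
    seed=66,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    eval_strategy="steps",
    eval_steps=500,                  # inferred from the evaluation rows in the results table
)
```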

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy | F1     |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|:------:|
| 196.6841      | 1.0875  | 500   | 166.6670        | 0.3484   | 0.2582 |
| 168.7515      | 2.1751  | 1000  | 157.8051        | 0.4136   | 0.4041 |
| 160.6266      | 3.2626  | 1500  | 153.1391        | 0.4471   | 0.4197 |
| 152.2116      | 4.3502  | 2000  | 150.9196        | 0.4225   | 0.4031 |
| 145.1439      | 5.4377  | 2500  | 145.2500        | 0.4988   | 0.4938 |
| 137.2401      | 6.5253  | 3000  | 145.8920        | 0.4992   | 0.4899 |
| 129.4263      | 7.6128  | 3500  | 139.7630        | 0.5123   | 0.5121 |
| 122.0069      | 8.7004  | 4000  | 142.0804        | 0.5096   | 0.4979 |
| 116.0619      | 9.7879  | 4500  | 142.2508        | 0.5297   | 0.5245 |
| 110.0063      | 10.8755 | 5000  | 136.2582        | 0.5444   | 0.5442 |
| 105.7         | 11.9630 | 5500  | 133.7262        | 0.5390   | 0.5382 |
| 101.5969      | 13.0506 | 6000  | 135.1363        | 0.5370   | 0.5366 |
| 98.3408       | 14.1381 | 6500  | 135.0693        | 0.5417   | 0.5426 |
| 95.2212       | 15.2257 | 7000  | 134.8993        | 0.5494   | 0.5498 |
| 92.8186       | 16.3132 | 7500  | 132.5809        | 0.5328   | 0.5321 |
| 90.2266       | 17.4008 | 8000  | 131.8017        | 0.5328   | 0.5343 |
| 88.4703       | 18.4883 | 8500  | 131.0873        | 0.5235   | 0.5213 |
| 87.1873       | 19.5759 | 9000  | 131.4287        | 0.5382   | 0.5379 |
| 85.7516       | 20.6634 | 9500  | 131.1887        | 0.5455   | 0.5448 |
| 84.7201       | 21.7510 | 10000 | 129.0503        | 0.5463   | 0.5463 |
| 83.2881       | 22.8385 | 10500 | 130.1761        | 0.5432   | 0.5438 |
| 82.0884       | 23.9260 | 11000 | 128.6964        | 0.5355   | 0.5366 |
| 81.8031       | 25.0136 | 11500 | 128.3357        | 0.5421   | 0.5415 |
| 80.7617       | 26.1011 | 12000 | 129.1492        | 0.5428   | 0.5441 |
| 80.189        | 27.1887 | 12500 | 127.5801        | 0.5552   | 0.5565 |
| 79.3389       | 28.2762 | 13000 | 127.4090        | 0.5370   | 0.5387 |
| 78.7626       | 29.3638 | 13500 | 129.6783        | 0.5505   | 0.5514 |
| 78.3347       | 30.4513 | 14000 | 129.3366        | 0.5517   | 0.5505 |
| 77.8829       | 31.5389 | 14500 | 128.1779        | 0.5421   | 0.5412 |
| 77.4036       | 32.6264 | 15000 | 128.5850        | 0.5309   | 0.5308 |
| 76.9029       | 33.7140 | 15500 | 126.4255        | 0.5509   | 0.5509 |
| 76.5128       | 34.8015 | 16000 | 126.3467        | 0.5424   | 0.5429 |
| 76.2921       | 35.8891 | 16500 | 125.5003        | 0.5428   | 0.5446 |
| 75.861        | 36.9766 | 17000 | 126.6583        | 0.5486   | 0.5496 |
| 75.5675       | 38.0642 | 17500 | 126.0611        | 0.5525   | 0.5519 |
| 75.2374       | 39.1517 | 18000 | 126.2538        | 0.5451   | 0.5457 |
| 75.1584       | 40.2393 | 18500 | 126.7418        | 0.5463   | 0.5448 |
| 74.9139       | 41.3268 | 19000 | 126.7482        | 0.5505   | 0.5491 |
| 74.565        | 42.4144 | 19500 | 126.0126        | 0.5494   | 0.5507 |
| 74.3026       | 43.5019 | 20000 | 124.4608        | 0.5563   | 0.5573 |
| 74.4625       | 44.5895 | 20500 | 125.6297        | 0.5563   | 0.5569 |
| 74.1991       | 45.6770 | 21000 | 125.5350        | 0.5498   | 0.5497 |
| 74.0552       | 46.7645 | 21500 | 124.8100        | 0.5552   | 0.5560 |
| 74.0304       | 47.8521 | 22000 | 126.1095        | 0.5482   | 0.5483 |
| 73.8794       | 48.9396 | 22500 | 125.3087        | 0.5552   | 0.5557 |
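
The final-row metrics could in principle be re-checked with a script along the lines below. This is a sketch under explicit assumptions: the evaluation set is taken to be the validation split of cardiffnlp/tweet_sentiment_multilingual (config "all"), its columns are assumed to be "text" and "label", the reported F1 is assumed to be macro-averaged, and scikit-learn (not listed under framework versions) is used only for metric computation.

```python
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "haryoaw/scenario-KD-PO-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual_all66sss"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

# Assumption: the "evaluation set" reported above is the validation split of the "all" config.
ds = load_dataset("cardiffnlp/tweet_sentiment_multilingual", "all", split="validation")

preds = []
for start in range(0, len(ds), 32):
    batch = ds[start : start + 32]  # dict of lists with "text" and "label" keys (assumed)
    enc = tokenizer(batch["text"], padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        preds.extend(model(**enc).logits.argmax(dim=-1).tolist())

print("accuracy:", accuracy_score(ds["label"], preds))
print("f1 (macro, assumed):", f1_score(ds["label"], preds, average="macro"))
```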

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1