bert-large-uncased-sst-2-16-13-smoothed

This model is a fine-tuned version of bert-large-uncased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6556
  • Accuracy: 0.75
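The accuracy values reported in this card (0.75 here, and values such as 0.4688, 0.5312 and 0.7188 in the training results) all look like multiples of 1/32 rounded to four decimals. That suggests, but does not confirm, an evaluation set of 32 examples, so each example would move accuracy by about three points. The check below is a rough sketch; the helper name `smallest_denominator` is illustrative and not part of any library.

```python
def smallest_denominator(values, max_n=1000, decimals=4):
    """Return the smallest n such that every value equals k/n (rounded) for some integer k."""
    for n in range(1, max_n + 1):
        if all(
            any(round(k / n, decimals) == v for k in range(n + 1))
            for v in values
        ):
            return n
    return None

# Distinct accuracy values that appear in this card.
reported = [0.5, 0.4688, 0.5312, 0.625, 0.6562, 0.6875, 0.7188, 0.75]
print(smallest_denominator(reported))
```

If the evaluation set really is that small, the reported accuracy carries wide uncertainty and should be read as a rough indication only.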

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log        | 1.0   | 1    | 0.7309          | 0.5
No log        | 2.0   | 2    | 0.7304          | 0.5
No log        | 3.0   | 3    | 0.7294          | 0.5
No log        | 4.0   | 4    | 0.7280          | 0.5
No log        | 5.0   | 5    | 0.7261          | 0.5
No log        | 6.0   | 6    | 0.7237          | 0.5
No log        | 7.0   | 7    | 0.7212          | 0.5
No log        | 8.0   | 8    | 0.7186          | 0.5
No log        | 9.0   | 9    | 0.7160          | 0.5
0.7321        | 10.0  | 10   | 0.7135          | 0.5
0.7321        | 11.0  | 11   | 0.7113          | 0.5
0.7321        | 12.0  | 12   | 0.7091          | 0.5
0.7321        | 13.0  | 13   | 0.7070          | 0.5
0.7321        | 14.0  | 14   | 0.7051          | 0.5
0.7321        | 15.0  | 15   | 0.7030          | 0.4688
0.7321        | 16.0  | 16   | 0.7012          | 0.4688
0.7321        | 17.0  | 17   | 0.6991          | 0.5
0.7321        | 18.0  | 18   | 0.6970          | 0.4688
0.7321        | 19.0  | 19   | 0.6949          | 0.5
0.6849        | 20.0  | 20   | 0.6930          | 0.5
0.6849        | 21.0  | 21   | 0.6914          | 0.4688
0.6849        | 22.0  | 22   | 0.6903          | 0.4688
0.6849        | 23.0  | 23   | 0.6895          | 0.5312
0.6849        | 24.0  | 24   | 0.6890          | 0.4688
0.6849        | 25.0  | 25   | 0.6883          | 0.4688
0.6849        | 26.0  | 26   | 0.6881          | 0.4688
0.6849        | 27.0  | 27   | 0.6877          | 0.5
0.6849        | 28.0  | 28   | 0.6867          | 0.5312
0.6849        | 29.0  | 29   | 0.6856          | 0.625
0.6342        | 30.0  | 30   | 0.6838          | 0.625
0.6342        | 31.0  | 31   | 0.6815          | 0.625
0.6342        | 32.0  | 32   | 0.6794          | 0.625
0.6342        | 33.0  | 33   | 0.6766          | 0.6562
0.6342        | 34.0  | 34   | 0.6739          | 0.625
0.6342        | 35.0  | 35   | 0.6715          | 0.625
0.6342        | 36.0  | 36   | 0.6692          | 0.6562
0.6342        | 37.0  | 37   | 0.6668          | 0.6875
0.6342        | 38.0  | 38   | 0.6646          | 0.6875
0.6342        | 39.0  | 39   | 0.6633          | 0.7188
0.5794        | 40.0  | 40   | 0.6624          | 0.7188
0.5794        | 41.0  | 41   | 0.6612          | 0.7188
0.5794        | 42.0  | 42   | 0.6600          | 0.75
0.5794        | 43.0  | 43   | 0.6601          | 0.75
0.5794        | 44.0  | 44   | 0.6602          | 0.75
0.5794        | 45.0  | 45   | 0.6609          | 0.75
0.5794        | 46.0  | 46   | 0.6629          | 0.6875
0.5794        | 47.0  | 47   | 0.6647          | 0.6875
0.5794        | 48.0  | 48   | 0.6635          | 0.7188
0.5794        | 49.0  | 49   | 0.6626          | 0.75
0.5487        | 50.0  | 50   | 0.6548          | 0.75
0.5487        | 51.0  | 51   | 0.6497          | 0.75
0.5487        | 52.0  | 52   | 0.6488          | 0.75
0.5487        | 53.0  | 53   | 0.6507          | 0.6875
0.5487        | 54.0  | 54   | 0.6529          | 0.6875
0.5487        | 55.0  | 55   | 0.6555          | 0.7188
0.5487        | 56.0  | 56   | 0.6577          | 0.7188
0.5487        | 57.0  | 57   | 0.6586          | 0.7188
0.5487        | 58.0  | 58   | 0.6587          | 0.7188
0.5487        | 59.0  | 59   | 0.6585          | 0.7188
0.5401        | 60.0  | 60   | 0.6592          | 0.6875
0.5401        | 61.0  | 61   | 0.6611          | 0.6875
0.5401        | 62.0  | 62   | 0.6623          | 0.6875
0.5401        | 63.0  | 63   | 0.6618          | 0.6875
0.5401        | 64.0  | 64   | 0.6601          | 0.6875
0.5401        | 65.0  | 65   | 0.6584          | 0.6875
0.5401        | 66.0  | 66   | 0.6570          | 0.7188
0.5401        | 67.0  | 67   | 0.6562          | 0.7188
0.5401        | 68.0  | 68   | 0.6555          | 0.7188
0.5401        | 69.0  | 69   | 0.6553          | 0.75
0.5397        | 70.0  | 70   | 0.6553          | 0.75
0.5397        | 71.0  | 71   | 0.6552          | 0.75
0.5397        | 72.0  | 72   | 0.6553          | 0.75
0.5397        | 73.0  | 73   | 0.6554          | 0.75
0.5397        | 74.0  | 74   | 0.6555          | 0.75
0.5397        | 75.0  | 75   | 0.6556          | 0.75
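
The Step column reaches 75 after 75 epochs, which means one optimizer step per epoch. Combined with the linear scheduler and 50 warmup steps listed under the hyperparameters, this implies that about two thirds of the run was spent warming up. Below is a minimal sketch of the usual linear warmup followed by linear decay, assuming 75 total steps (read off the table) and ignoring off-by-one details of when the scheduler is stepped. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, peak_lr=1e-5, warmup_steps=50, total_steps=75):
    """Learning rate after `step` optimizer steps under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 10, 25, 50, 60, 75):
    print(step, linear_lr(step))
```

Under this reading the learning rate only reaches its 1e-05 peak around step 50 and then falls to zero by step 75, so most of the accuracy gains in the table occurred while the rate was still below its peak.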

Framework versions

  • Transformers 4.32.0.dev0
  • PyTorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3
