bert-large-uncased-sst-2-16-13

This model is a fine-tuned version of bert-large-uncased on an unknown dataset (judging by the model name, most likely a 16-example-per-seed subset of SST-2, though this is not documented). It achieves the following results on the evaluation set:

  • Loss: 0.6280
  • Accuracy: 0.7812

Model description

More information needed

Intended uses & limitations

More information needed
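
Although the intended-use details are missing, the checkpoint is a sequence-classification fine-tune of bert-large-uncased, so it can presumably be loaded as sketched below. This is a usage sketch, not an official example: the repository id is taken from this model card, and the 0 = negative / 1 = positive label mapping follows the usual SST-2 convention rather than anything documented here.

  # Minimal usage sketch (assumptions: the checkpoint exports a standard
  # AutoModelForSequenceClassification head and labels follow the SST-2
  # convention of 0 = negative, 1 = positive).
  import torch
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

  model_id = "simonycl/bert-large-uncased-sst-2-16-13"  # assumed repository id
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForSequenceClassification.from_pretrained(model_id)
  model.eval()

  inputs = tokenizer("A thoroughly enjoyable film.", return_tensors="pt")
  with torch.no_grad():
      logits = model(**inputs).logits
  prediction = logits.argmax(dim=-1).item()
  print("positive" if prediction == 1 else "negative")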

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 150
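
These values map directly onto fields of the Hugging Face TrainingArguments class. The sketch below is one plausible reconstruction of the configuration, not the author's actual training script; the output directory, dataset variables, and per-epoch evaluation strategy are assumptions.

  # Hedged reconstruction of the configuration above; the Adam betas and
  # epsilon listed match the Trainer defaults, so they are not set explicitly.
  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="bert-large-uncased-sst-2-16-13",  # placeholder
      learning_rate=1e-5,
      per_device_train_batch_size=32,
      per_device_eval_batch_size=32,
      seed=42,
      lr_scheduler_type="linear",
      warmup_steps=50,
      num_train_epochs=150,
      evaluation_strategy="epoch",  # assumed: validation is reported every epoch below
  )

  # A Trainer would then be built as usual (model and datasets assumed):
  # trainer = Trainer(model=model, args=training_args,
  #                   train_dataset=train_dataset, eval_dataset=eval_dataset)
  # trainer.train()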

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log 1.0 1 0.7881 0.5
No log 2.0 2 0.7873 0.5
No log 3.0 3 0.7860 0.5
No log 4.0 4 0.7840 0.5
No log 5.0 5 0.7810 0.5
No log 6.0 6 0.7772 0.5
No log 7.0 7 0.7723 0.5
No log 8.0 8 0.7668 0.5
No log 9.0 9 0.7600 0.5
0.782 10.0 10 0.7522 0.5
0.782 11.0 11 0.7438 0.5
0.782 12.0 12 0.7344 0.5
0.782 13.0 13 0.7252 0.5
0.782 14.0 14 0.7148 0.5
0.782 15.0 15 0.7043 0.5
0.782 16.0 16 0.6943 0.5
0.782 17.0 17 0.6857 0.4688
0.782 18.0 18 0.6769 0.5
0.782 19.0 19 0.6674 0.5312
0.685 20.0 20 0.6591 0.5938
0.685 21.0 21 0.6526 0.625
0.685 22.0 22 0.6435 0.625
0.685 23.0 23 0.6347 0.5938
0.685 24.0 24 0.6278 0.625
0.685 25.0 25 0.6261 0.5938
0.685 26.0 26 0.6250 0.625
0.685 27.0 27 0.6247 0.625
0.685 28.0 28 0.6225 0.625
0.685 29.0 29 0.6159 0.6562
0.4699 30.0 30 0.6056 0.6562
0.4699 31.0 31 0.5906 0.6875
0.4699 32.0 32 0.5795 0.6875
0.4699 33.0 33 0.5844 0.7812
0.4699 34.0 34 0.5925 0.7188
0.4699 35.0 35 0.5942 0.7188
0.4699 36.0 36 0.5956 0.6875
0.4699 37.0 37 0.5921 0.6875
0.4699 38.0 38 0.5860 0.6875
0.4699 39.0 39 0.5844 0.6875
0.3039 40.0 40 0.5793 0.7188
0.3039 41.0 41 0.5738 0.75
0.3039 42.0 42 0.5734 0.75
0.3039 43.0 43 0.5744 0.75
0.3039 44.0 44 0.5782 0.6875
0.3039 45.0 45 0.5817 0.6875
0.3039 46.0 46 0.5858 0.6875
0.3039 47.0 47 0.5888 0.6875
0.3039 48.0 48 0.5836 0.6875
0.3039 49.0 49 0.5724 0.7188
0.1969 50.0 50 0.5572 0.7188
0.1969 51.0 51 0.5442 0.7812
0.1969 52.0 52 0.5347 0.7812
0.1969 53.0 53 0.5288 0.7812
0.1969 54.0 54 0.5284 0.75
0.1969 55.0 55 0.5307 0.7812
0.1969 56.0 56 0.5386 0.7812
0.1969 57.0 57 0.5475 0.75
0.1969 58.0 58 0.5535 0.75
0.1969 59.0 59 0.5550 0.7188
0.1348 60.0 60 0.5533 0.7188
0.1348 61.0 61 0.5412 0.7812
0.1348 62.0 62 0.5322 0.7812
0.1348 63.0 63 0.5256 0.8125
0.1348 64.0 64 0.5189 0.8125
0.1348 65.0 65 0.5148 0.8125
0.1348 66.0 66 0.5154 0.7812
0.1348 67.0 67 0.5162 0.75
0.1348 68.0 68 0.5202 0.75
0.1348 69.0 69 0.5255 0.75
0.0823 70.0 70 0.5330 0.75
0.0823 71.0 71 0.5367 0.75
0.0823 72.0 72 0.5413 0.75
0.0823 73.0 73 0.5434 0.75
0.0823 74.0 74 0.5415 0.75
0.0823 75.0 75 0.5395 0.75
0.0823 76.0 76 0.5394 0.75
0.0823 77.0 77 0.5380 0.75
0.0823 78.0 78 0.5379 0.75
0.0823 79.0 79 0.5396 0.75
0.0519 80.0 80 0.5426 0.75
0.0519 81.0 81 0.5426 0.75
0.0519 82.0 82 0.5419 0.75
0.0519 83.0 83 0.5446 0.75
0.0519 84.0 84 0.5467 0.75
0.0519 85.0 85 0.5487 0.75
0.0519 86.0 86 0.5522 0.75
0.0519 87.0 87 0.5566 0.75
0.0519 88.0 88 0.5614 0.75
0.0519 89.0 89 0.5672 0.75
0.0382 90.0 90 0.5713 0.75
0.0382 91.0 91 0.5744 0.75
0.0382 92.0 92 0.5773 0.75
0.0382 93.0 93 0.5799 0.75
0.0382 94.0 94 0.5806 0.75
0.0382 95.0 95 0.5777 0.75
0.0382 96.0 96 0.5761 0.75
0.0382 97.0 97 0.5746 0.75
0.0382 98.0 98 0.5710 0.7812
0.0382 99.0 99 0.5697 0.7812
0.0266 100.0 100 0.5676 0.7812
0.0266 101.0 101 0.5650 0.7812
0.0266 102.0 102 0.5637 0.7812
0.0266 103.0 103 0.5623 0.7812
0.0266 104.0 104 0.5631 0.7812
0.0266 105.0 105 0.5633 0.7812
0.0266 106.0 106 0.5635 0.7812
0.0266 107.0 107 0.5638 0.8125
0.0266 108.0 108 0.5646 0.7812
0.0266 109.0 109 0.5662 0.7812
0.0205 110.0 110 0.5694 0.7812
0.0205 111.0 111 0.5737 0.7812
0.0205 112.0 112 0.5797 0.7812
0.0205 113.0 113 0.5851 0.7812
0.0205 114.0 114 0.5923 0.7812
0.0205 115.0 115 0.6008 0.7812
0.0205 116.0 116 0.6091 0.7812
0.0205 117.0 117 0.6162 0.75
0.0205 118.0 118 0.6201 0.75
0.0205 119.0 119 0.6233 0.75
0.0168 120.0 120 0.6255 0.75
0.0168 121.0 121 0.6274 0.75
0.0168 122.0 122 0.6293 0.75
0.0168 123.0 123 0.6265 0.75
0.0168 124.0 124 0.6245 0.75
0.0168 125.0 125 0.6239 0.75
0.0168 126.0 126 0.6232 0.75
0.0168 127.0 127 0.6221 0.7812
0.0168 128.0 128 0.6216 0.7812
0.0168 129.0 129 0.6213 0.7812
0.0139 130.0 130 0.6214 0.7812
0.0139 131.0 131 0.6212 0.7812
0.0139 132.0 132 0.6218 0.7812
0.0139 133.0 133 0.6234 0.7812
0.0139 134.0 134 0.6248 0.7812
0.0139 135.0 135 0.6259 0.7812
0.0139 136.0 136 0.6269 0.7812
0.0139 137.0 137 0.6275 0.7812
0.0139 138.0 138 0.6277 0.7812
0.0139 139.0 139 0.6280 0.7812
0.0126 140.0 140 0.6281 0.7812
0.0126 141.0 141 0.6283 0.7812
0.0126 142.0 142 0.6281 0.7812
0.0126 143.0 143 0.6279 0.7812
0.0126 144.0 144 0.6279 0.7812
0.0126 145.0 145 0.6278 0.7812
0.0126 146.0 146 0.6278 0.7812
0.0126 147.0 147 0.6279 0.7812
0.0126 148.0 148 0.6279 0.7812
0.0126 149.0 149 0.6279 0.7812
0.0121 150.0 150 0.6280 0.7812

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3