LongRiver/distilbert-base-cased-finetuned

This model is a fine-tuned version of distilbert-base-cased on an unknown dataset. After the final training epoch (epoch 29), it achieves the following results on the training and evaluation sets:

  • Train Loss: 0.0150
  • Train End Logits Accuracy: 0.9962
  • Train Start Logits Accuracy: 0.9947
  • Validation Loss: 4.6938
  • Validation End Logits Accuracy: 0.5474
  • Validation Start Logits Accuracy: 0.5004
  • Epoch: 29
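
The start and end logits accuracies are reported separately for each output head. The following is a minimal sketch of how such a figure could be computed, assuming it is plain top-1 accuracy (the argmax of each head's logits compared with the labelled token index). The function name and the toy values are hypothetical and are not taken from the training run.

```python
def logits_accuracy(logits_batch, true_positions):
    """Fraction of examples whose highest-scoring token index equals the label."""
    correct = 0
    for logits, true_pos in zip(logits_batch, true_positions):
        # Index of the largest logit, i.e. the predicted token position.
        predicted = max(range(len(logits)), key=lambda i: logits[i])
        correct += int(predicted == true_pos)
    return correct / len(true_positions)


# Toy start logits for two examples over four tokens (illustrative values only).
start_logits = [[0.1, 2.3, -1.0, 0.4], [1.5, 0.2, 0.3, 0.9]]
start_labels = [1, 3]
start_accuracy = logits_accuracy(start_logits, start_labels)
```

The same function would be applied to the end logits and end labels to obtain the end logits accuracy.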

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 67860, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
  • training_precision: float32
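
The learning-rate entry in the optimizer configuration describes a polynomial decay with power 1.0, which amounts to a linear ramp from 2e-05 down to 0.0 over 67860 steps. The sketch below reproduces that schedule in plain Python from the listed values, assuming the standard polynomial-decay formula. The function name is hypothetical, and the code is an illustration rather than the training script itself.

```python
# Values copied from the optimizer configuration above.
INITIAL_LR = 2e-05
END_LR = 0.0
DECAY_STEPS = 67860
POWER = 1.0


def polynomial_decay_lr(step):
    """Learning rate at a given optimizer step (cycle=False behaviour)."""
    # Without cycling, the rate stays at END_LR once DECAY_STEPS is reached.
    step = min(step, DECAY_STEPS)
    remaining = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


# Assumption: the schedule was sized to the 30 logged epochs (0 to 29).
STEPS_PER_EPOCH = DECAY_STEPS // 30
```

Under that assumption, each epoch corresponds to 2262 optimizer steps, and the learning rate reaches zero exactly at the end of epoch 29.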

Training results

| Train Loss | Train End Logits Accuracy | Train Start Logits Accuracy | Validation Loss | Validation End Logits Accuracy | Validation Start Logits Accuracy | Epoch |
|---|---|---|---|---|---|---|
| 2.3620 | 0.5061 | 0.4972 | 2.0785 | 0.5155 | 0.4836 | 0 |
| 1.7007 | 0.5940 | 0.5660 | 2.0332 | 0.5185 | 0.4999 | 1 |
| 1.4088 | 0.6542 | 0.6191 | 2.0391 | 0.5324 | 0.5012 | 2 |
| 1.1407 | 0.7150 | 0.6757 | 2.1645 | 0.5172 | 0.4854 | 3 |
| 0.9215 | 0.7670 | 0.7296 | 2.2074 | 0.5365 | 0.4995 | 4 |
| 0.7376 | 0.8083 | 0.7780 | 2.4099 | 0.5146 | 0.4865 | 5 |
| 0.5780 | 0.8456 | 0.8186 | 2.6543 | 0.5231 | 0.4764 | 6 |
| 0.4614 | 0.8748 | 0.8511 | 2.6688 | 0.5360 | 0.4944 | 7 |
| 0.3633 | 0.9015 | 0.8785 | 2.9329 | 0.5300 | 0.4908 | 8 |
| 0.2981 | 0.9177 | 0.8983 | 3.1868 | 0.5270 | 0.4759 | 9 |
| 0.2453 | 0.9318 | 0.9156 | 3.3015 | 0.5347 | 0.4951 | 10 |
| 0.1958 | 0.9440 | 0.9333 | 3.5149 | 0.5335 | 0.4860 | 11 |
| 0.1649 | 0.9521 | 0.9433 | 3.4351 | 0.5424 | 0.4975 | 12 |
| 0.1425 | 0.9590 | 0.9505 | 3.6372 | 0.5264 | 0.4800 | 13 |
| 0.1231 | 0.9644 | 0.9579 | 3.7467 | 0.5346 | 0.4827 | 14 |
| 0.1024 | 0.9703 | 0.9636 | 3.8551 | 0.5400 | 0.4945 | 15 |
| 0.0882 | 0.9730 | 0.9692 | 3.9909 | 0.5412 | 0.4880 | 16 |
| 0.0740 | 0.9785 | 0.9738 | 4.0573 | 0.5376 | 0.4920 | 17 |
| 0.0691 | 0.9789 | 0.9760 | 4.0751 | 0.5292 | 0.4903 | 18 |
| 0.0588 | 0.9837 | 0.9792 | 4.0823 | 0.5377 | 0.4967 | 19 |
| 0.0498 | 0.9849 | 0.9826 | 4.2466 | 0.5376 | 0.4967 | 20 |
| 0.0464 | 0.9864 | 0.9848 | 4.2565 | 0.5446 | 0.4999 | 21 |
| 0.0388 | 0.9889 | 0.9864 | 4.3063 | 0.5329 | 0.4941 | 22 |
| 0.0331 | 0.9900 | 0.9894 | 4.4083 | 0.5420 | 0.4962 | 23 |
| 0.0274 | 0.9922 | 0.9914 | 4.5627 | 0.5455 | 0.5023 | 24 |
| 0.0257 | 0.9925 | 0.9916 | 4.6541 | 0.5503 | 0.5122 | 25 |
| 0.0229 | 0.9935 | 0.9925 | 4.4773 | 0.5433 | 0.4985 | 26 |
| 0.0181 | 0.9951 | 0.9943 | 4.6989 | 0.5480 | 0.5066 | 27 |
| 0.0161 | 0.9953 | 0.9947 | 4.6873 | 0.5466 | 0.4995 | 28 |
| 0.0150 | 0.9962 | 0.9947 | 4.6938 | 0.5474 | 0.5004 | 29 |
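
In the results above, the training loss falls steadily while the validation loss rises after the first few epochs. The sketch below uses the logged validation losses to find the epoch with the lowest value and to show where a simple early-stopping rule would have halted training. The helper `early_stop_epoch` and the patience of 3 are hypothetical choices for illustration and were not part of the original run.

```python
# Validation loss per epoch (0 to 29), copied from the training results.
val_loss = [
    2.0785, 2.0332, 2.0391, 2.1645, 2.2074, 2.4099, 2.6543, 2.6688,
    2.9329, 3.1868, 3.3015, 3.5149, 3.4351, 3.6372, 3.7467, 3.8551,
    3.9909, 4.0573, 4.0751, 4.0823, 4.2466, 4.2565, 4.3063, 4.4083,
    4.5627, 4.6541, 4.4773, 4.6989, 4.6873, 4.6938,
]


def early_stop_epoch(losses, patience):
    """Epoch at which training would stop after `patience` epochs without improvement."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    # No stop triggered: training runs to the last logged epoch.
    return len(losses) - 1


# Epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=lambda e: val_loss[e])
# Hypothetical patience of 3 epochs.
stop_epoch = early_stop_epoch(val_loss, patience=3)
```

With these values, the lowest validation loss (2.0332) occurs at epoch 1, and a patience of 3 would have stopped training at epoch 4.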

Framework versions

  • Transformers 4.39.3
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2