distilbert_oscarth_0060

This model is a fine-tuned version of distilbert-base-uncased on an unknown dataset. After the final training epoch (epoch 59, counting from 0), it reports the following training and validation results:

  • Train Loss: 1.1876
  • Validation Loss: 1.1378
  • Epoch: 59

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
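
The optimizer entry above is a serialized configuration, not code. The following is a minimal sketch of how such a configuration might be turned back into an optimizer object. It assumes the configuration came from the `AdamWeightDecay` class exported by the TensorFlow side of Transformers; the helper name `optimizer_kwargs` is hypothetical. The import is guarded so the snippet still runs where Transformers or TensorFlow is not installed.

```python
# Optimizer configuration copied from the hyperparameter list above.
OPTIMIZER_CONFIG = {
    "name": "AdamWeightDecay",
    "learning_rate": 2e-05,
    "decay": 0.0,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def optimizer_kwargs(config):
    """Hypothetical helper: drop bookkeeping keys, keep constructor arguments."""
    # "name" identifies the class; "decay" is 0.0 here (no learning-rate decay),
    # so it is omitted rather than passed through.
    skipped = {"name", "decay"}
    return {key: value for key, value in config.items() if key not in skipped}


kwargs = optimizer_kwargs(OPTIMIZER_CONFIG)

try:
    # Assumption: the TensorFlow optimizer shipped with Transformers.
    from transformers import AdamWeightDecay

    optimizer = AdamWeightDecay(**kwargs)
except Exception:
    # Transformers or TensorFlow is unavailable; only the kwargs are built.
    optimizer = None
```

With `decay` at 0.0 and no schedule recorded, the learning rate appears to have been held constant at 2e-05, with decoupled weight decay at a rate of 0.01.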

Training results

Train Loss    Validation Loss    Epoch
4.1327        2.9983             0
2.7813        2.4562             1
2.4194        2.2066             2
2.2231        2.0562             3
2.0894        1.9450             4
1.9905        1.8621             5
1.9148        1.7941             6
1.8508        1.7363             7
1.7976        1.6909             8
1.7509        1.6488             9
1.7126        1.6124             10
1.6764        1.5835             11
1.6450        1.5521             12
1.6175        1.5282             13
1.5919        1.5045             14
1.5679        1.4833             15
1.5476        1.4627             16
1.5271        1.4498             17
1.5098        1.4270             18
1.4909        1.4161             19
1.4760        1.3995             20
1.4609        1.3864             21
1.4475        1.3717             22
1.4333        1.3590             23
1.4203        1.3478             24
1.4093        1.3403             25
1.3980        1.3296             26
1.3875        1.3176             27
1.3773        1.3094             28
1.3674        1.3011             29
1.3579        1.2920             30
1.3497        1.2826             31
1.3400        1.2764             32
1.3326        1.2694             33
1.3236        1.2635             34
1.3169        1.2536             35
1.3096        1.2477             36
1.3024        1.2408             37
1.2957        1.2364             38
1.2890        1.2296             39
1.2818        1.2236             40
1.2751        1.2168             41
1.2691        1.2126             42
1.2644        1.2044             43
1.2583        1.2008             44
1.2529        1.1962             45
1.2473        1.1919             46
1.2416        1.1857             47
1.2365        1.1812             48
1.2318        1.1765             49
1.2273        1.1738             50
1.2224        1.1672             51
1.2177        1.1673             52
1.2132        1.1595             53
1.2084        1.1564             54
1.2033        1.1518             55
1.1993        1.1481             56
1.1966        1.1445             57
1.1924        1.1412             58
1.1876        1.1378             59
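
One way to read the table is to check where validation loss stops improving. The sketch below does this for the last ten epochs only, with values copied from the table. The function names `non_improving_epochs` and `best_epoch` are illustrative and not part of any library.

```python
# (epoch, train_loss, validation_loss) for epochs 50-59, copied from the table.
HISTORY = [
    (50, 1.2273, 1.1738),
    (51, 1.2224, 1.1672),
    (52, 1.2177, 1.1673),
    (53, 1.2132, 1.1595),
    (54, 1.2084, 1.1564),
    (55, 1.2033, 1.1518),
    (56, 1.1993, 1.1481),
    (57, 1.1966, 1.1445),
    (58, 1.1924, 1.1412),
    (59, 1.1876, 1.1378),
]


def non_improving_epochs(history):
    """Return epochs whose validation loss is not lower than the previous epoch's."""
    flagged = []
    for (_, _, prev_val), (epoch, _, val) in zip(history, history[1:]):
        if val >= prev_val:
            flagged.append(epoch)
    return flagged


def best_epoch(history):
    """Return the (epoch, validation_loss) pair with the lowest validation loss."""
    epoch, _, val = min(history, key=lambda row: row[2])
    return epoch, val


stalled = non_improving_epochs(HISTORY)
best = best_epoch(HISTORY)
print(stalled)  # [52]
print(best)     # (59, 1.1378)
```

Within this window only epoch 52 fails to improve, and only by 0.0001. Validation loss is still decreasing at epoch 59, so the run may not have fully converged.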

Framework versions

  • Transformers 4.20.1
  • TensorFlow 2.8.2
  • Datasets 2.3.2
  • Tokenizers 0.12.1
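
When reproducing the environment, it can help to compare installed package versions against the list above. The following is a small sketch using only the standard library. The PyPI distribution names (`transformers`, `tensorflow`, `datasets`, `tokenizers`) are assumed to correspond to the frameworks listed, and `version_report` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.20.1",
    "tensorflow": "2.8.2",
    "datasets": "2.3.2",
    "tokenizers": "0.12.1",
}


def version_report(expected):
    """Map each package to (expected, installed, matches); installed is None if absent."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in version_report(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```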