---
license: mit
base_model: gpt2
tags:
- generated_from_keras_callback
model-index:
- name: vedantjumle/indo-ml-final-test-gpt2
  results: []
---

# vedantjumle/indo-ml-final-test-gpt2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on a dataset that is not documented in this card. It reports the following metrics from the final training epoch (a hedged loading sketch follows this list):

- Train Loss: 0.0791
- Validation Loss: 0.5970
- Train Accuracy: 0.86
- Epoch: 48
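
The card does not say which task head was trained on top of GPT-2, so the snippet below is only a minimal sketch: it assumes the checkpoint and its tokenizer are published under this repo id on the Hugging Face Hub, and it loads the base transformer with the generic TensorFlow auto classes. Swap in the task-specific class (for example, TFAutoModelForSequenceClassification) once the head is known.

```python
# Minimal loading sketch (assumption: weights and tokenizer live under this repo id
# on the Hub; the task head is not documented, so only the base model is loaded).
from transformers import AutoTokenizer, TFAutoModel

repo_id = "vedantjumle/indo-ml-final-test-gpt2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = TFAutoModel.from_pretrained(repo_id)

inputs = tokenizer("example input text", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```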

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (the sketch after this list rebuilds the optimizer from this config):

- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 3000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
- training_precision: float32
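
To reproduce this setup, the sketch below rebuilds the logged optimizer with the Keras API matching the TensorFlow 2.13 environment listed under Framework versions: Adam driving a linear PolynomialDecay from 2e-05 to 0.0 over 3000 steps. The commented compile call is illustrative only; the actual model and loss are not documented in this card.

```python
import tensorflow as tf

# Rebuild the logged learning-rate schedule: linear (power=1.0) decay
# from 2e-05 to 0.0 over 3000 steps, no cycling.
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=2e-05,
    decay_steps=3000,
    end_learning_rate=0.0,
    power=1.0,
    cycle=False,
)

# Rebuild the logged Adam configuration.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=lr_schedule,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-08,
    amsgrad=False,
    jit_compile=True,
)

# model.compile(optimizer=optimizer)  # illustrative: attach to whichever Keras model is being fine-tuned
```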

### Training results

| Train Loss | Validation Loss | Train Accuracy | Epoch |
|:----------:|:---------------:|:--------------:|:-----:|
| 6.1590     | 5.0940          | 0.01           | 0     |
| 5.0418     | 4.9883          | 0.0133         | 1     |
| 4.9133     | 4.8504          | 0.0333         | 2     |
| 4.7401     | 4.6073          | 0.07           | 3     |
| 4.3767     | 3.9978          | 0.1767         | 4     |
| 3.6892     | 3.2744          | 0.35           | 5     |
| 2.9908     | 2.6567          | 0.4933         | 6     |
| 2.3695     | 2.2079          | 0.6033         | 7     |
| 1.9372     | 1.8126          | 0.66           | 8     |
| 1.5314     | 1.5588          | 0.7133         | 9     |
| 1.2590     | 1.3589          | 0.7333         | 10    |
| 1.0342     | 1.2366          | 0.7433         | 11    |
| 0.8585     | 1.1181          | 0.77           | 12    |
| 0.7366     | 1.0283          | 0.78           | 13    |
| 0.6208     | 0.9584          | 0.7933         | 14    |
| 0.5448     | 0.9084          | 0.8133         | 15    |
| 0.4745     | 0.8591          | 0.8033         | 16    |
| 0.4187     | 0.8293          | 0.83           | 17    |
| 0.3628     | 0.7953          | 0.84           | 18    |
| 0.3299     | 0.7676          | 0.8467         | 19    |
| 0.3072     | 0.7536          | 0.8267         | 20    |
| 0.2794     | 0.7395          | 0.8367         | 21    |
| 0.2370     | 0.7114          | 0.8567         | 22    |
| 0.2203     | 0.6990          | 0.8467         | 23    |
| 0.2104     | 0.6906          | 0.8433         | 24    |
| 0.1838     | 0.6815          | 0.86           | 25    |
| 0.1680     | 0.6633          | 0.8533         | 26    |
| 0.1650     | 0.6629          | 0.8533         | 27    |
| 0.1558     | 0.6536          | 0.8567         | 28    |
| 0.1482     | 0.6499          | 0.86           | 29    |
| 0.1404     | 0.6465          | 0.86           | 30    |
| 0.1340     | 0.6385          | 0.8567         | 31    |
| 0.1226     | 0.6313          | 0.8533         | 32    |
| 0.1212     | 0.6257          | 0.86           | 33    |
| 0.1120     | 0.6220          | 0.86           | 34    |
| 0.1084     | 0.6271          | 0.86           | 35    |
| 0.1043     | 0.6172          | 0.86           | 36    |
| 0.1046     | 0.6173          | 0.86           | 37    |
| 0.0989     | 0.6127          | 0.86           | 38    |
| 0.0969     | 0.6106          | 0.86           | 39    |
| 0.0918     | 0.6161          | 0.8633         | 40    |
| 0.0916     | 0.6062          | 0.86           | 41    |
| 0.0892     | 0.6037          | 0.86           | 42    |
| 0.0822     | 0.6037          | 0.86           | 43    |
| 0.0865     | 0.5968          | 0.86           | 44    |
| 0.0819     | 0.5992          | 0.86           | 45    |
| 0.0847     | 0.5988          | 0.86           | 46    |
| 0.0805     | 0.5971          | 0.86           | 47    |
| 0.0791     | 0.5970          | 0.86           | 48    |

### Framework versions

- Transformers 4.33.2
- TensorFlow 2.13.0
- Datasets 2.14.5
- Tokenizers 0.13.3
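
Since serialized Keras optimizer configs and saved weights can be sensitive to library versions, a small, purely illustrative check like the one below can confirm that the installed packages match the versions listed above.

```python
# Illustrative version check against the versions listed in this card.
import transformers, tensorflow, datasets, tokenizers

expected = {
    "transformers": "4.33.2",
    "tensorflow": "2.13.0",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "tensorflow": tensorflow.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "OK" if installed[name] == version else f"mismatch (installed {installed[name]})"
    print(f"{name} {version}: {status}")
```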