
clm-gpt2

This model is a fine-tuned version of openai-community/gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5054
  • Accuracy: 0.6325
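
As a rough aid to reading the loss figure, the sketch below converts it to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is what the Transformers Trainer typically reports for causal language modeling; the card itself does not state this.

```python
import math

# Evaluation loss reported above.
# Assumption: mean per-token cross-entropy measured in nats.
eval_loss = 1.5054

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"eval loss = {eval_loss:.4f} -> perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 4.5. Because the dataset is unknown, the figure is not directly comparable with numbers measured on other corpora.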

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 3.0
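
The sketch below restates these settings as a plain dictionary and checks how they fit together. It rests on several assumptions: the key names are those of `transformers.TrainingArguments` (the card does not show the actual training script), the run used a single device, and the total step count is an estimate taken from the logged epoch and step values rather than a documented number. The schedule function mirrors the usual shape of linear warmup followed by linear decay; it is an illustration, not the library's code.

```python
# Hyperparameters copied from the list above.
# Assumption: key names follow transformers.TrainingArguments.
hparams = {
    "learning_rate": 3e-3,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1000,
    "num_train_epochs": 3.0,
}

# Assumption: one device, which is consistent with 4 * 4 = 16.
NUM_DEVICES = 1
effective_batch = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Assumption: estimated from the logged epoch/step pairs, not stated in the card.
TOTAL_STEPS = 11_525


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    for step in (0, 500, 1000, 6000, TOTAL_STEPS):
        lr = linear_lr(
            step, hparams["learning_rate"], hparams["warmup_steps"], TOTAL_STEPS
        )
        print(f"step {step:>6}: lr = {lr:.6f}")
```

With these keys, the dictionary could in principle be unpacked into `TrainingArguments(output_dir=..., **hparams)`, where `output_dir` is a placeholder. Whether the original run was configured this way is not confirmed by the card.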

Training results

Training Loss   Epoch    Step    Validation Loss   Accuracy
2.4536          0.1302   500     2.1316            0.4955
2.1054          0.2603   1000    2.0124            0.5221
1.9756          0.3905   1500    1.9025            0.5453
1.8863          0.5206   2000    1.8367            0.5601
1.8283          0.6508   2500    1.7927            0.5686
1.7893          0.7809   3000    1.7585            0.5760
1.7555          0.9111   3500    1.7328            0.5815
1.7143          1.0413   4000    1.7016            0.5882
1.6697          1.1714   4500    1.6813            0.5930
1.6584          1.3016   5000    1.6615            0.5972
1.6438          1.4317   5500    1.6422            0.6009
1.6184          1.5619   6000    1.6236            0.6049
1.6086          1.6920   6500    1.6102            0.6082
1.5882          1.8222   7000    1.5938            0.6114
1.5719          1.9524   7500    1.5786            0.6148
1.5272          2.0825   8000    1.5718            0.6175
1.4971          2.2127   8500    1.5593            0.6204
1.4893          2.3428   9000    1.5475            0.6227
1.4808          2.4730   9500    1.5382            0.6251
1.4689          2.6031   10000   1.5274            0.6275
1.4572          2.7333   10500   1.5169            0.6298
1.4488          2.8635   11000   1.5106            0.6315
1.4465          2.9936   11500   1.5054            0.6325
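
The Epoch and Step columns together hint at the size of the training set, which the card does not otherwise document. The sketch below estimates optimizer steps per epoch from two logged rows and multiplies by the total train batch size of 16. It assumes every optimizer step consumed a full batch of 16 sequences, so the result is only an approximation.

```python
# (epoch, step) pairs taken from the first and last rows of the results table.
logged = [(0.1302, 500), (2.9936, 11500)]

# Total train batch size from the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 16

# Each row gives an independent estimate of optimizer steps per epoch.
estimates = [step / epoch for epoch, step in logged]
steps_per_epoch = sum(estimates) / len(estimates)

# Assumption: every optimizer step saw a full batch of 16 sequences.
approx_train_sequences = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training sequences per epoch ~ {approx_train_sequences:.0f}")
```

Both rows agree on roughly 3,840 steps per epoch, which would correspond to a training set on the order of 61,000 sequences. The sequence length (block size) is not stated in the card, so no token count can be derived from this.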

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1

Model size: 124M params (Safetensors, tensor type F32)