
h0-1

This model is a fine-tuned version of microsoft/CodeGPT-small-py on the hearthstone dataset (see the GitHub repo). It achieves the following results on the evaluation set:

  • Loss: 0.3622
  • Exact Match: 0.1970
  • Bleu: 0.9193
  • Codebleu: 0.7686
  • Chrf: 93.5686
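
The text-similarity scores above (exact match, BLEU, chrF) can be reproduced with the evaluate library. Below is a minimal sketch using placeholder predictions and references; CodeBLEU typically requires a separate implementation and is omitted here.

```python
import evaluate

# Hypothetical generated / gold Python snippets (placeholders for illustration).
predictions = ["class Foo(MinionCard):\n    pass"]
references = ["class Foo(MinionCard):\n    pass"]

exact_match = evaluate.load("exact_match")
bleu = evaluate.load("bleu")
chrf = evaluate.load("chrf")

print(exact_match.compute(predictions=predictions, references=references))
# BLEU and chrF expect one list of reference strings per prediction.
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
print(chrf.compute(predictions=predictions, references=[[r] for r in references]))
```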

Model description

CodeGPT-small-py fine-tuned on the HearthStone dataset for 200 epochs.

Intended uses & limitations

HearthStone card code synthesis: generating a card's Python implementation from its textual description.
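
A minimal generation sketch, assuming the standard Transformers causal-LM API; the prompt placeholder below is an assumption and should match the card-description encoding used in the hearthstone dataset preprocessing:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint from the Hub.
tokenizer = AutoTokenizer.from_pretrained("dvitel/h0-1")
model = AutoModelForCausalLM.from_pretrained("dvitel/h0-1")

# NOTE: illustrative placeholder; use the dataset's actual input format.
prompt = "<HearthStone card description in the dataset's input format>"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,                      # greedy decoding for deterministic code output
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```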

Training and evaluation data

See the splits of the hearthstone dataset used for training and evaluation.
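
A short sketch for inspecting the data with the datasets library; the Hub id below is assumed from the card's dataset link and may need adjusting:

```python
from datasets import load_dataset

# Assumed dataset id; replace with the id linked from this card if it differs.
ds = load_dataset("dvitel/hearthstone")

print(ds)              # available splits and their sizes
print(ds["train"][0])  # one card description / code pair
```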

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 17
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 200
  • mixed_precision_training: Native AMP
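
For reference, a hedged TrainingArguments sketch mirroring the values above, assuming the standard Hugging Face Trainer was used; output_dir and any settings not listed are illustrative assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="h0-1",                 # assumption: not specified in the card
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=17,
    lr_scheduler_type="cosine",
    num_train_epochs=200,
    fp16=True,                         # Native AMP mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```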

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Exact Match | Bleu   | Codebleu | Chrf    |
|:-------------:|:------:|:-----:|:---------------:|:-----------:|:------:|:--------:|:-------:|
| 0.2482        | 11.94  | 1600  | 0.2828          | 0.1364      | 0.9012 | 0.7012   | 92.2247 |
| 0.0203        | 23.88  | 3200  | 0.2968          | 0.1970      | 0.9114 | 0.7298   | 93.0236 |
| 0.0082        | 35.82  | 4800  | 0.3049          | 0.1970      | 0.9125 | 0.7480   | 93.1997 |
| 0.0049        | 47.76  | 6400  | 0.3190          | 0.1818      | 0.9125 | 0.7526   | 93.0967 |
| 0.0038        | 59.7   | 8000  | 0.3289          | 0.1818      | 0.9117 | 0.7348   | 93.1293 |
| 0.0024        | 71.64  | 9600  | 0.3358          | 0.1970      | 0.9142 | 0.7555   | 93.0747 |
| 0.0022        | 83.58  | 11200 | 0.3379          | 0.1970      | 0.9164 | 0.7642   | 93.2931 |
| 0.0013        | 95.52  | 12800 | 0.3444          | 0.2121      | 0.9189 | 0.7700   | 93.4456 |
| 0.0009        | 107.46 | 14400 | 0.3408          | 0.1970      | 0.9188 | 0.7655   | 93.4808 |
| 0.0006        | 119.4  | 16000 | 0.3522          | 0.1970      | 0.9177 | 0.7510   | 93.4061 |
| 0.0003        | 131.34 | 17600 | 0.3589          | 0.2121      | 0.9178 | 0.7614   | 93.3980 |
| 0.0002        | 143.28 | 19200 | 0.3562          | 0.2121      | 0.9179 | 0.7634   | 93.5130 |
| 0.0002        | 155.22 | 20800 | 0.3624          | 0.1970      | 0.9208 | 0.7699   | 93.6707 |
| 0.0001        | 167.16 | 22400 | 0.3608          | 0.1970      | 0.9193 | 0.7703   | 93.6082 |
| 0.0001        | 179.1  | 24000 | 0.3620          | 0.1970      | 0.9190 | 0.7667   | 93.5154 |
| 0.0001        | 191.04 | 25600 | 0.3622          | 0.1970      | 0.9193 | 0.7686   | 93.5686 |

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.13.0
  • Datasets 2.6.1
  • Tokenizers 0.13.1