---
license: apache-2.0
base_model: EleutherAI/pythia-70m
tags:
  - generated_from_trainer
model-index:
  - name: polish_wikipedia_model
    results: []
---

# polish_wikipedia_model

This model is a fine-tuned version of [EleutherAI/pythia-70m](https://huggingface.co/EleutherAI/pythia-70m) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0319
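Since the base model is a standard causal language model, the checkpoint should load with the usual `transformers` auto classes. A minimal sketch, assuming the hypothetical hub repo id `Pyro-X2/polish_wikipedia_model` (inferred from the card, not confirmed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id inferred from the card; replace with the actual hub path.
repo_id = "Pyro-X2/polish_wikipedia_model"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Generate a short Polish continuation from a seed prompt.
inputs = tokenizer("Warszawa jest stolicą", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```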

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
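A minimal `transformers.TrainingArguments` sketch matching these settings; the output directory and per-epoch evaluation are assumptions (the per-epoch validation losses below imply `eval_strategy="epoch"`), and the `Trainer` default AdamW already uses the listed betas and epsilon:

```python
from transformers import TrainingArguments  # requires transformers>=4.41 for eval_strategy

training_args = TrainingArguments(
    output_dir="polish_wikipedia_model",  # hypothetical; the card does not name one
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",  # inferred from the per-epoch validation losses
)
```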

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 133  | 0.4322          |
| No log        | 2.0   | 266  | 0.3766          |
| No log        | 3.0   | 399  | 0.3599          |
| 0.4669        | 4.0   | 532  | 0.3247          |
| 0.4669        | 5.0   | 665  | 0.2830          |
| 0.4669        | 6.0   | 798  | 0.2628          |
| 0.4669        | 7.0   | 931  | 0.2573          |
| 0.3481        | 8.0   | 1064 | 0.2443          |
| 0.3481        | 9.0   | 1197 | 0.1904          |
| 0.3481        | 10.0  | 1330 | 0.1799          |
| 0.3481        | 11.0  | 1463 | 0.1475          |
| 0.2502        | 12.0  | 1596 | 0.1292          |
| 0.2502        | 13.0  | 1729 | 0.1168          |
| 0.2502        | 14.0  | 1862 | 0.1103          |
| 0.2502        | 15.0  | 1995 | 0.0989          |
| 0.1572        | 16.0  | 2128 | 0.0890          |
| 0.1572        | 17.0  | 2261 | 0.0736          |
| 0.1572        | 18.0  | 2394 | 0.0672          |
| 0.1007        | 19.0  | 2527 | 0.0592          |
| 0.1007        | 20.0  | 2660 | 0.0550          |
| 0.1007        | 21.0  | 2793 | 0.0517          |
| 0.1007        | 22.0  | 2926 | 0.0497          |
| 0.0674        | 23.0  | 3059 | 0.0458          |
| 0.0674        | 24.0  | 3192 | 0.0421          |
| 0.0674        | 25.0  | 3325 | 0.0394          |
| 0.0674        | 26.0  | 3458 | 0.0378          |
| 0.0491        | 27.0  | 3591 | 0.0357          |
| 0.0491        | 28.0  | 3724 | 0.0337          |
| 0.0491        | 29.0  | 3857 | 0.0323          |
| 0.0491        | 30.0  | 3990 | 0.0319          |
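For context, cross-entropy loss converts to perplexity as `exp(loss)`, assuming the `Trainer` default of mean per-token loss in nats; checking the first and final validation losses:

```python
import math

# Perplexity = exp(mean per-token cross-entropy); values from the table above.
for loss in (0.4322, 0.0319):
    print(f"validation loss {loss:.4f} -> perplexity {math.exp(loss):.3f}")
# validation loss 0.4322 -> perplexity 1.541
# validation loss 0.0319 -> perplexity 1.032
```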

### Framework versions

- Transformers 4.41.1
- PyTorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1