---
license: llama3
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B
model-index:
  - name: Genpro_Llama3-8b
    results: []
---

# Genpro_Llama3-8b

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unspecified dataset.
It achieves the following results on the evaluation set:

- Loss: 0.5784
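
The card ships without a usage snippet, so the following is a minimal inference sketch: it loads the base model and applies the PEFT adapter with `peft.PeftModel`. The adapter repository id is assumed from the model name and may differ; substitute the real repo id or a local adapter path.

```python
# Minimal inference sketch (hypothetical repo id, assumed from the model name).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "Lalith16/Genpro_Llama3-8b"  # assumed repository id; replace as needed

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the PEFT adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Write a short product description for a solar-powered lamp."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```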

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 5
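
The card lists only the raw hyperparameters, so here is a hedged reconstruction of how such a run could be configured with TRL's `SFTTrainer` and a PEFT LoRA config (consistent with the `trl`, `sft`, and `peft` tags). The dataset path, text field, LoRA rank/alpha, and maximum sequence length are not documented on this card and are purely illustrative placeholders.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# Dataset, text field, LoRA rank/alpha, and max_seq_length are placeholders.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# Placeholder dataset; the actual training data is not documented.
dataset = load_dataset("json", data_files={"train": "train.jsonl", "eval": "eval.jsonl"})

# Assumed LoRA settings; the card does not record the adapter configuration.
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05)

args = TrainingArguments(
    output_dir="Genpro_Llama3-8b",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",            # Adam-style optimizer, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=5,
    evaluation_strategy="steps",
    eval_steps=100,                 # matches the 100-step cadence in the results table
    logging_steps=100,
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B",
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["eval"],
    peft_config=peft_config,
    dataset_text_field="text",      # placeholder field name
    max_seq_length=1024,            # assumed
)
trainer.train()
```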

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.3266 | 0.0634 | 100 | 1.3260 |
| 1.123 | 0.1267 | 200 | 1.1090 |
| 1.0242 | 0.1901 | 300 | 1.0121 |
| 1.0228 | 0.2535 | 400 | 0.9520 |
| 0.9834 | 0.3169 | 500 | 0.9037 |
| 0.9726 | 0.3802 | 600 | 0.8456 |
| 0.9003 | 0.4436 | 700 | 0.8270 |
| 0.8862 | 0.5070 | 800 | 0.7967 |
| 0.7788 | 0.5703 | 900 | 0.7715 |
| 0.831 | 0.6337 | 1000 | 0.7528 |
| 0.7875 | 0.6971 | 1100 | 0.7319 |
| 0.8284 | 0.7605 | 1200 | 0.7097 |
| 0.7387 | 0.8238 | 1300 | 0.6927 |
| 0.7573 | 0.8872 | 1400 | 0.6735 |
| 0.7744 | 0.9506 | 1500 | 0.6668 |
| 0.5684 | 1.0139 | 1600 | 0.6487 |
| 0.5606 | 1.0773 | 1700 | 0.6378 |
| 0.5268 | 1.1407 | 1800 | 0.6363 |
| 0.5727 | 1.2041 | 1900 | 0.6269 |
| 0.5456 | 1.2674 | 2000 | 0.6196 |
| 0.5174 | 1.3308 | 2100 | 0.6146 |
| 0.499 | 1.3942 | 2200 | 0.6055 |
| 0.5831 | 1.4575 | 2300 | 0.5984 |
| 0.4884 | 1.5209 | 2400 | 0.5952 |
| 0.5538 | 1.5843 | 2500 | 0.5829 |
| 0.5302 | 1.6477 | 2600 | 0.5805 |
| 0.5506 | 1.7110 | 2700 | 0.5758 |
| 0.5509 | 1.7744 | 2800 | 0.5708 |
| 0.5249 | 1.8378 | 2900 | 0.5597 |
| 0.5249 | 1.9011 | 3000 | 0.5601 |
| 0.4597 | 1.9645 | 3100 | 0.5585 |
| 0.383 | 2.0279 | 3200 | 0.5643 |
| 0.4115 | 2.0913 | 3300 | 0.5666 |
| 0.3928 | 2.1546 | 3400 | 0.5737 |
| 0.4634 | 2.2180 | 3500 | 0.5587 |
| 0.4093 | 2.2814 | 3600 | 0.5615 |
| 0.3724 | 2.3447 | 3700 | 0.5529 |
| 0.3846 | 2.4081 | 3800 | 0.5604 |
| 0.4206 | 2.4715 | 3900 | 0.5539 |
| 0.4803 | 2.5349 | 4000 | 0.5422 |
| 0.4319 | 2.5982 | 4100 | 0.5452 |
| 0.3762 | 2.6616 | 4200 | 0.5523 |
| 0.4472 | 2.7250 | 4300 | 0.5319 |
| 0.4048 | 2.7883 | 4400 | 0.5370 |
| 0.4227 | 2.8517 | 4500 | 0.5401 |
| 0.4407 | 2.9151 | 4600 | 0.5294 |
| 0.3998 | 2.9785 | 4700 | 0.5282 |
| 0.336 | 3.0418 | 4800 | 0.5504 |
| 0.3022 | 3.1052 | 4900 | 0.5608 |
| 0.3323 | 3.1686 | 5000 | 0.5584 |
| 0.3306 | 3.2319 | 5100 | 0.5560 |
| 0.3557 | 3.2953 | 5200 | 0.5478 |
| 0.3475 | 3.3587 | 5300 | 0.5656 |
| 0.3515 | 3.4221 | 5400 | 0.5520 |
| 0.3236 | 3.4854 | 5500 | 0.5479 |
| 0.3886 | 3.5488 | 5600 | 0.5436 |
| 0.339 | 3.6122 | 5700 | 0.5408 |
| 0.3509 | 3.6755 | 5800 | 0.5499 |
| 0.3651 | 3.7389 | 5900 | 0.5447 |
| 0.3707 | 3.8023 | 6000 | 0.5340 |
| 0.3122 | 3.8657 | 6100 | 0.5360 |
| 0.3613 | 3.9290 | 6200 | 0.5326 |
| 0.364 | 3.9924 | 6300 | 0.5315 |
| 0.2418 | 4.0558 | 6400 | 0.5719 |
| 0.2349 | 4.1191 | 6500 | 0.5686 |
| 0.2366 | 4.1825 | 6600 | 0.5750 |
| 0.2433 | 4.2459 | 6700 | 0.5739 |
| 0.2566 | 4.3093 | 6800 | 0.5664 |
| 0.2524 | 4.3726 | 6900 | 0.5798 |
| 0.2667 | 4.4360 | 7000 | 0.5570 |
| 0.2528 | 4.4994 | 7100 | 0.5573 |
| 0.2348 | 4.5627 | 7200 | 0.5723 |
| 0.2629 | 4.6261 | 7300 | 0.5742 |
| 0.2705 | 4.6895 | 7400 | 0.5743 |
| 0.2893 | 4.7529 | 7500 | 0.5560 |
| 0.2371 | 4.8162 | 7600 | 0.5652 |
| 0.287 | 4.8796 | 7700 | 0.5436 |
| 0.2725 | 4.9430 | 7800 | 0.5784 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1