metadata

license: llama3
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B
model-index:
  - name: Genpro_Llama3-8b
    results: []

Genpro_Llama3-8b

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B; the training dataset was not recorded. At the final evaluation (step 7800) it achieves the following result on the evaluation set:

Loss: 0.5784
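
Since the metadata lists peft as the library and meta-llama/Meta-Llama-3-8B as the base model, the published weights are presumably a PEFT adapter rather than a full checkpoint. The sketch below shows one way to load it under that assumption. ADAPTER_ID is a placeholder because the card does not state the repository namespace, and the helper names load_model and generate are illustrative only.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B"    # from the card's base_model field
ADAPTER_ID = "your-namespace/Genpro_Llama3-8b"  # hypothetical repo id; replace with the real one


def load_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the fine-tuned adapter (assumed PEFT format)."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(base_model_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=64):
    """Generate a continuation for a plain-text prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling load_model() downloads the base weights, which requires access to the gated Meta-Llama-3-8B repository. The card does not document a prompt format, so the prompt is passed through unchanged.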

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.03
num_epochs: 5
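
For anyone trying to reproduce the run, the values above can be mapped onto Transformers TrainingArguments roughly as follows. This is a sketch, not the original training script: it assumes the listed batch sizes are per-device values on a single device, and output_dir is a made-up name. The card does not record LoRA settings, sequence length, or gradient accumulation, so those are left at library defaults.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments field names.
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 4,  # assumes the listed batch size is per device
    "per_device_eval_batch_size": 8,   # same assumption
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.03,
    "num_train_epochs": 5,
}


def build_training_args(output_dir="genpro-llama3-8b-sft"):
    """Build TrainingArguments from the listed values (output_dir is hypothetical)."""
    from transformers import TrainingArguments  # lazy import: heavy dependency

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

One caveat: the plain constant schedule in Transformers is generally understood not to apply warmup (that is what constant_with_warmup is for), so the recorded warmup ratio of 0.03 may have had no effect on this run.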

Training results

Training Loss	Epoch	Step	Validation Loss
1.3266	0.0634	100	1.3260
1.123	0.1267	200	1.1090
1.0242	0.1901	300	1.0121
1.0228	0.2535	400	0.9520
0.9834	0.3169	500	0.9037
0.9726	0.3802	600	0.8456
0.9003	0.4436	700	0.8270
0.8862	0.5070	800	0.7967
0.7788	0.5703	900	0.7715
0.831	0.6337	1000	0.7528
0.7875	0.6971	1100	0.7319
0.8284	0.7605	1200	0.7097
0.7387	0.8238	1300	0.6927
0.7573	0.8872	1400	0.6735
0.7744	0.9506	1500	0.6668
0.5684	1.0139	1600	0.6487
0.5606	1.0773	1700	0.6378
0.5268	1.1407	1800	0.6363
0.5727	1.2041	1900	0.6269
0.5456	1.2674	2000	0.6196
0.5174	1.3308	2100	0.6146
0.499	1.3942	2200	0.6055
0.5831	1.4575	2300	0.5984
0.4884	1.5209	2400	0.5952
0.5538	1.5843	2500	0.5829
0.5302	1.6477	2600	0.5805
0.5506	1.7110	2700	0.5758
0.5509	1.7744	2800	0.5708
0.5249	1.8378	2900	0.5597
0.5249	1.9011	3000	0.5601
0.4597	1.9645	3100	0.5585
0.383	2.0279	3200	0.5643
0.4115	2.0913	3300	0.5666
0.3928	2.1546	3400	0.5737
0.4634	2.2180	3500	0.5587
0.4093	2.2814	3600	0.5615
0.3724	2.3447	3700	0.5529
0.3846	2.4081	3800	0.5604
0.4206	2.4715	3900	0.5539
0.4803	2.5349	4000	0.5422
0.4319	2.5982	4100	0.5452
0.3762	2.6616	4200	0.5523
0.4472	2.7250	4300	0.5319
0.4048	2.7883	4400	0.5370
0.4227	2.8517	4500	0.5401
0.4407	2.9151	4600	0.5294
0.3998	2.9785	4700	0.5282
0.336	3.0418	4800	0.5504
0.3022	3.1052	4900	0.5608
0.3323	3.1686	5000	0.5584
0.3306	3.2319	5100	0.5560
0.3557	3.2953	5200	0.5478
0.3475	3.3587	5300	0.5656
0.3515	3.4221	5400	0.5520
0.3236	3.4854	5500	0.5479
0.3886	3.5488	5600	0.5436
0.339	3.6122	5700	0.5408
0.3509	3.6755	5800	0.5499
0.3651	3.7389	5900	0.5447
0.3707	3.8023	6000	0.5340
0.3122	3.8657	6100	0.5360
0.3613	3.9290	6200	0.5326
0.364	3.9924	6300	0.5315
0.2418	4.0558	6400	0.5719
0.2349	4.1191	6500	0.5686
0.2366	4.1825	6600	0.5750
0.2433	4.2459	6700	0.5739
0.2566	4.3093	6800	0.5664
0.2524	4.3726	6900	0.5798
0.2667	4.4360	7000	0.5570
0.2528	4.4994	7100	0.5573
0.2348	4.5627	7200	0.5723
0.2629	4.6261	7300	0.5742
0.2705	4.6895	7400	0.5743
0.2893	4.7529	7500	0.5560
0.2371	4.8162	7600	0.5652
0.287	4.8796	7700	0.5436
0.2725	4.9430	7800	0.5784
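
The table shows validation loss reaching its minimum of 0.5282 at step 4700, near the end of the third epoch, and then drifting upward while training loss keeps falling, which is the usual sign of overfitting. The sketch below illustrates the comparison using a subset of rows copied from the table (the last evaluation of each epoch); the function names are illustrative.

```python
# (step, epoch, validation_loss) rows copied from the training results table:
# the last evaluation logged in each epoch.
ROWS = [
    (1500, 0.9506, 0.6668),
    (3100, 1.9645, 0.5585),
    (4700, 2.9785, 0.5282),
    (6300, 3.9924, 0.5315),
    (7800, 4.9430, 0.5784),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    step, epoch, _ = rows[-1]
    return round(step / epoch)


best_step, best_epoch, best_loss = best_row(ROWS)
final_step, final_epoch, final_loss = ROWS[-1]
gap = round(final_loss - best_loss, 4)

print(f"best:  step {best_step} (epoch {best_epoch}), validation loss {best_loss}")
print(f"final: step {final_step} (epoch {final_epoch}), validation loss {final_loss}")
print(f"final checkpoint is worse than the best by {gap}")
print(f"approx. steps per epoch: {steps_per_epoch(ROWS)}")
```

If intermediate checkpoints were saved, the one nearest step 4700 would likely generalize better than the final one, whose validation loss is the 0.5784 reported at the top of this card.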

Framework versions

PEFT 0.11.1
Transformers 4.41.1
Pytorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
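
To check a local environment against the versions listed above, a small standard-library script such as the following could be used. The PyPI distribution names (for example torch for Pytorch) are assumptions about how the packages were installed, and compare_versions is an illustrative helper name.

```python
from importlib import metadata

# Assumed PyPI distribution names mapped to the versions listed in this card.
EXPECTED_VERSIONS = {
    "peft": "0.11.1",
    "transformers": "4.41.1",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED_VERSIONS):
    """Return {name: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the adapter will fail to load; the listed versions are simply the ones recorded at training time.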