cls_alldata_llama3_v1

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4523

Model description

More information needed

Intended uses & limitations

More information needed
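
The card does not yet document usage. As a starting point, the sketch below shows one way to load this adapter on top of the base model with PEFT and Transformers; it assumes the adapter is published as Sorour/cls_alldata_llama3_v1 and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights. The prompt is a placeholder, not the task the adapter was trained for.

```python
# Minimal loading sketch (assumption: adapter repo id Sorour/cls_alldata_llama3_v1).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "Sorour/cls_alldata_llama3_v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Placeholder prompt; replace with the task this adapter was trained for.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```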

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2
  • mixed_precision_training: Native AMP
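
For reference, these settings map onto transformers.TrainingArguments roughly as sketched below. This is not the original training script: the dataset, PEFT configuration, and trainer are omitted, and the output directory name is illustrative.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cls_alldata_llama3_v1",   # illustrative name
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,        # effective train batch size: 2 x 4 = 8
    lr_scheduler_type="constant",
    warmup_ratio=0.03,                    # as listed in the card
    num_train_epochs=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                            # "Native AMP" mixed precision
)
```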

Training results

Training Loss   Epoch    Step   Validation Loss
0.6921          0.0582     20   0.6831
0.5975          0.1164     40   0.6416
0.6107          0.1747     60   0.6082
0.5609          0.2329     80   0.5883
0.5857          0.2911    100   0.5761
0.5386          0.3493    120   0.5660
0.5176          0.4076    140   0.5529
0.5317          0.4658    160   0.5379
0.5244          0.5240    180   0.5292
0.5218          0.5822    200   0.5234
0.5003          0.6405    220   0.5207
0.5024          0.6987    240   0.5096
0.4913          0.7569    260   0.5062
0.5174          0.8151    280   0.5003
0.4675          0.8734    300   0.4968
0.5137          0.9316    320   0.4903
0.4883          0.9898    340   0.4869
0.3616          1.0480    360   0.4935
0.3713          1.1063    380   0.4890
0.3650          1.1645    400   0.4856
0.3732          1.2227    420   0.4838
0.3717          1.2809    440   0.4842
0.3657          1.3392    460   0.4811
0.3767          1.3974    480   0.4762
0.3859          1.4556    500   0.4763
0.3773          1.5138    520   0.4712
0.3615          1.5721    540   0.4671
0.3656          1.6303    560   0.4666
0.3497          1.6885    580   0.4658
0.3818          1.7467    600   0.4621
0.3759          1.8049    620   0.4626
0.3539          1.8632    640   0.4551
0.3985          1.9214    660   0.4525
0.3668          1.9796    680   0.4523

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1