lmind_nq_train6000_eval6489_v1_qa_1e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_qa dataset. It achieves the following results on the evaluation set:

Loss: 2.0414
Accuracy: 0.6011

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 2
eval_batch_size: 2
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 4
total_train_batch_size: 32
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.05
num_epochs: 50.0
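
The batch-size totals above follow from the per-device values, and they can be cross-checked against the step counts in the results table. The sketch below is only a sanity check: the figure of 6000 training examples is an assumption read off the dataset name (`train6000`), not something the card states, and `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Number of examples consumed per optimizer step (or per eval step if grad_accum=1)."""
    return per_device * num_devices * grad_accum


# Values taken from the hyperparameter list above.
total_train = effective_batch_size(per_device=2, num_devices=4, grad_accum=4)
total_eval = effective_batch_size(per_device=2, num_devices=4)

# ASSUMPTION: 6000 training examples, inferred from "train6000" in the dataset name.
assumed_train_examples = 6000
steps_per_epoch = assumed_train_examples / total_train

# The last logged step in the results table is 9350.
final_epoch = round(9350 / steps_per_epoch, 2)

print(total_train, total_eval, steps_per_epoch, final_epoch)
```

Under that assumption an epoch is 187.5 optimizer steps, which is consistent with the logged steps 187, 375, 562, 750 and with the final row stopping at epoch 49.87 rather than 50.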

Training results

Training Loss	Epoch	Step	Accuracy	Validation Loss
1.598	1.0	187	0.6147	1.2692
1.1923	2.0	375	0.6176	1.2733
0.9732	3.0	562	0.6136	1.3396
0.7763	4.0	750	0.6104	1.4358
0.6498	5.0	937	0.6052	1.5630
0.57	6.0	1125	0.6031	1.6599
0.5253	7.0	1312	0.6027	1.7480
0.4958	8.0	1500	0.6021	1.8060
0.4521	9.0	1687	0.6013	1.8599
0.443	10.0	1875	0.6013	1.9468
0.439	11.0	2062	0.6015	1.9500
0.433	12.0	2250	0.6021	1.9104
0.4323	13.0	2437	0.6001	2.0079
0.4281	14.0	2625	0.6008	1.9881
0.4277	15.0	2812	0.6005	2.0305
0.4298	16.0	3000	0.6005	2.0478
0.4082	17.0	3187	0.6007	2.0539
0.411	18.0	3375	0.6005	2.0314
0.4113	19.0	3562	0.6011	2.0368
0.4121	20.0	3750	0.6017	2.1022
0.414	21.0	3937	0.6007	2.0512
0.4163	22.0	4125	0.6016	2.1147
0.4172	23.0	4312	0.6007	2.0942
0.4156	24.0	4500	0.6008	2.1201
0.3997	25.0	4687	0.6010	2.0660
0.3994	26.0	4875	0.6006	2.0832
0.4032	27.0	5062	0.6003	2.1423
0.4058	28.0	5250	0.6015	2.1000
0.4065	29.0	5437	0.6009	2.1065
0.4068	30.0	5625	0.6006	2.1389
0.4091	31.0	5812	0.6005	2.1241
0.4103	32.0	6000	0.6010	2.1241
0.3959	33.0	6187	0.6021	2.1206
0.3974	34.0	6375	0.6017	2.1061
0.3983	35.0	6562	0.6013	2.1041
0.4034	36.0	6750	0.6017	2.0843
0.4035	37.0	6937	0.6035	2.0837
0.4013	38.0	7125	0.6015	2.1708
0.4063	39.0	7312	0.602	2.0946
0.4049	40.0	7500	0.6019	2.1671
0.391	41.0	7687	0.6026	2.1508
0.3913	42.0	7875	0.5998	2.2062
0.3945	43.0	8062	0.6012	2.2214
0.3953	44.0	8250	0.6005	2.2576
0.3959	45.0	8437	0.6001	2.2755
0.3961	46.0	8625	0.6014	2.3085
0.3982	47.0	8812	0.5992	2.3093
0.4028	48.0	9000	0.6007	2.1926
0.3915	49.0	9187	0.6018	2.0674
0.4009	49.87	9350	0.6011	2.0414
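
One way to read the table is to look for the epoch with the best validation metrics. The sketch below uses a hand-copied subset of rows (not the full log), chosen to include the extremes of the table, and the variable names are illustrative only.

```python
# (epoch, training_loss, accuracy, validation_loss), copied from selected rows above.
rows = [
    (1.0, 1.598, 0.6147, 1.2692),
    (2.0, 1.1923, 0.6176, 1.2733),
    (3.0, 0.9732, 0.6136, 1.3396),
    (5.0, 0.6498, 0.6052, 1.5630),
    (10.0, 0.443, 0.6013, 1.9468),
    (25.0, 0.3997, 0.6010, 2.0660),
    (47.0, 0.3982, 0.5992, 2.3093),
    (49.87, 0.4009, 0.6011, 2.0414),
]

# Row with the lowest validation loss and row with the highest accuracy.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_acc = max(rows, key=lambda r: r[2])

print("lowest validation loss at epoch", best_by_loss[0])
print("highest accuracy at epoch", best_by_acc[0])
```

In this table, validation loss is lowest after epoch 1 and accuracy peaks at epoch 2, while training loss keeps falling through later epochs. That pattern is usually read as overfitting, so the reported final checkpoint (loss 2.0414, accuracy 0.6011) is probably not the best one by either metric.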

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.18.0
Tokenizers 0.14.1
