lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3 dataset. It achieves the following results on the evaluation set:

  • Loss: 8.0353
  • Accuracy: 0.1945
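The repository name suggests these weights are a LoRA adapter rather than full model weights. Below is a minimal loading sketch, assuming the repo `tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2` contains a PEFT-compatible adapter (the card does not publish a loading script, so this is an assumption):

```python
# Hedged sketch: load the base model and apply this repo as a PEFT LoRA adapter.
# Assumes the repo hosts a PEFT adapter; access to meta-llama/Llama-2-7b-hf
# requires accepting Meta's license on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(
    base, "tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2"
)
```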

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
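As a reference, the listed values map onto Hugging Face `TrainingArguments` roughly as follows. This is a hedged sketch, not the author's training script: the `output_dir` is a placeholder, and the LoRA/PEFT wiring is omitted since it is not documented in the card.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=2,   # 2 per device x 4 GPUs x 4 accumulation = 32 total
    per_device_eval_batch_size=2,    # 2 per device x 4 GPUs = 8 total
    gradient_accumulation_steps=4,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,               # as listed; note HF's "constant" schedule does not apply warmup
    num_train_epochs=50.0,
    seed=42,
)
```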

Training results

| Training Loss | Epoch | Step  | Accuracy | Validation Loss |
|:-------------:|:-----:|:-----:|:--------:|:---------------:|
| 1.3903        | 1.0   | 341   | 0.4564   | 3.9959          |
| 1.2           | 2.0   | 683   | 0.4396   | 4.5103          |
| 0.9155        | 3.0   | 1024  | 0.4285   | 4.8751          |
| 0.6446        | 4.0   | 1366  | 0.4326   | 4.8178          |
| 0.455         | 5.0   | 1707  | 0.4404   | 4.9434          |
| 0.3104        | 6.0   | 2049  | 0.4313   | 5.2226          |
| 0.2187        | 7.0   | 2390  | 0.4296   | 5.1633          |
| 0.1903        | 8.0   | 2732  | 0.4377   | 5.0743          |
| 0.1639        | 9.0   | 3073  | 0.4339   | 5.2491          |
| 0.1685        | 10.0  | 3415  | 0.4356   | 5.1677          |
| 0.1575        | 11.0  | 3756  | 0.4358   | 5.0421          |
| 0.151         | 12.0  | 4098  | 0.4338   | 5.1801          |
| 0.1586        | 13.0  | 4439  | 0.4347   | 5.2149          |
| 0.1492        | 14.0  | 4781  | 0.4356   | 5.1413          |
| 0.1539        | 15.0  | 5122  | 0.4309   | 5.2818          |
| 0.1472        | 16.0  | 5464  | 0.4372   | 5.0858          |
| 0.1503        | 17.0  | 5805  | 0.4341   | 5.1719          |
| 0.1449        | 18.0  | 6147  | 0.4301   | 5.3105          |
| 0.1384        | 19.0  | 6488  | 0.4263   | 5.2427          |
| 0.1472        | 20.0  | 6830  | 0.4309   | 5.2501          |
| 0.1389        | 21.0  | 7171  | 0.4309   | 5.0945          |
| 0.1456        | 22.0  | 7513  | 0.4327   | 5.2462          |
| 0.1398        | 23.0  | 7854  | 0.428    | 5.4476          |
| 0.1342        | 24.0  | 8196  | 0.4322   | 5.2605          |
| 0.1414        | 25.0  | 8537  | 0.4284   | 5.3590          |
| 0.1364        | 26.0  | 8879  | 0.4277   | 5.4423          |
| 0.1427        | 27.0  | 9220  | 0.4242   | 5.5243          |
| 0.1351        | 28.0  | 9562  | 0.4295   | 5.4508          |
| 0.1412        | 29.0  | 9903  | 0.4302   | 5.3767          |
| 0.1369        | 30.0  | 10245 | 0.4257   | 5.4378          |
| 0.1332        | 31.0  | 10586 | 0.4288   | 5.5004          |
| 0.14          | 32.0  | 10928 | 0.4261   | 5.6715          |
| 0.1336        | 33.0  | 11269 | 0.4268   | 5.5130          |
| 0.1412        | 34.0  | 11611 | 0.4266   | 5.5420          |
| 0.1357        | 35.0  | 11952 | 0.4182   | 5.6517          |
| 0.1363        | 36.0  | 12294 | 0.4208   | 5.4598          |
| 0.134         | 37.0  | 12635 | 0.4221   | 5.6220          |
| 0.1255        | 38.0  | 12977 | 0.4227   | 5.6988          |
| 0.1303        | 39.0  | 13318 | 0.4252   | 5.5511          |
| 0.2073        | 40.0  | 13660 | 0.4109   | 5.6976          |
| 0.1609        | 41.0  | 14001 | 0.4095   | 5.6908          |
| 0.1384        | 42.0  | 14343 | 0.4166   | 5.7460          |
| 0.1401        | 43.0  | 14684 | 0.4145   | 5.6377          |
| 0.1535        | 44.0  | 15026 | 0.4209   | 5.5295          |
| 0.1542        | 45.0  | 15367 | 0.2157   | 7.6505          |
| 7.7307        | 46.0  | 15709 | 0.2470   | 6.9279          |
| 7.3843        | 47.0  | 16050 | 0.1716   | 8.9680          |
| 8.5059        | 48.0  | 16392 | 0.1716   | 8.8324          |
| 7.9257        | 49.0  | 16733 | 0.1924   | 7.8902          |
| 7.855         | 49.93 | 17050 | 0.1945   | 8.0353          |

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1