license: llama2
base_model: meta-llama/Llama-2-7b-hf
tags:
  - generated_from_trainer
datasets:
  - tyzhu/lmind_nq_train6000_eval6489_v1_qa
metrics:
  - accuracy
model-index:
  - name: lmind_nq_train6000_eval6489_v1_qa_1e-4_lora2
    results:
      - task:
          name: Causal Language Modeling
          type: text-generation
        dataset:
          name: tyzhu/lmind_nq_train6000_eval6489_v1_qa
          type: tyzhu/lmind_nq_train6000_eval6489_v1_qa
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.6010769230769231

lmind_nq_train6000_eval6489_v1_qa_1e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_qa dataset. At the final checkpoint (step 9350, epoch 49.87), it achieves the following results on the evaluation set:

  • Loss: 2.0414
  • Accuracy: 0.6011
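
For a causal language model, the evaluation loss is usually the mean per-token cross-entropy in nats, in which case it can be read as a perplexity via exp(loss). The sketch below assumes that convention applies to the loss reported here; the helper name `perplexity` is illustrative and not part of the training code.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Evaluation loss reported in this card for the final checkpoint.
final_eval_loss = 2.0414

print(f"perplexity ~ {perplexity(final_eval_loss):.2f}")  # prints "perplexity ~ 7.70"
```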

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
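
The batch-size and step figures in this card can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the card does not state outright: that the training split holds 6000 examples (inferred from "train6000" in the dataset name), and that the number of optimizer updates per epoch is floored when the batch count is not a multiple of the accumulation steps. Under those assumptions the numbers line up with the final row of the training log (step 9350, epoch 49.87).

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 2              # per-device batch size
num_devices = 4
gradient_accumulation_steps = 4
num_epochs = 50

# Assumption: 6000 training examples, inferred from the dataset name.
num_train_examples = 6000

# Examples consumed per optimizer update across all devices.
effective_batch = train_batch_size * num_devices * gradient_accumulation_steps

# Forward/backward batches per epoch, counted per device.
batches_per_epoch = math.ceil(num_train_examples / (train_batch_size * num_devices))

# Assumption: optimizer updates per epoch are floored.
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_updates = updates_per_epoch * num_epochs

# Fractional epoch reached once the final update has been applied.
final_epoch = total_updates / (batches_per_epoch / gradient_accumulation_steps)

print(effective_batch)        # 32
print(updates_per_epoch)      # 187
print(total_updates)          # 9350
print(round(final_epoch, 2))  # 49.87
```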

Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|---|---|---|---|---|
| 1.598 | 1.0 | 187 | 0.6147 | 1.2692 |
| 1.1923 | 2.0 | 375 | 0.6176 | 1.2733 |
| 0.9732 | 3.0 | 562 | 0.6136 | 1.3396 |
| 0.7763 | 4.0 | 750 | 0.6104 | 1.4358 |
| 0.6498 | 5.0 | 937 | 0.6052 | 1.5630 |
| 0.57 | 6.0 | 1125 | 0.6031 | 1.6599 |
| 0.5253 | 7.0 | 1312 | 0.6027 | 1.7480 |
| 0.4958 | 8.0 | 1500 | 0.6021 | 1.8060 |
| 0.4521 | 9.0 | 1687 | 0.6013 | 1.8599 |
| 0.443 | 10.0 | 1875 | 0.6013 | 1.9468 |
| 0.439 | 11.0 | 2062 | 0.6015 | 1.9500 |
| 0.433 | 12.0 | 2250 | 0.6021 | 1.9104 |
| 0.4323 | 13.0 | 2437 | 0.6001 | 2.0079 |
| 0.4281 | 14.0 | 2625 | 0.6008 | 1.9881 |
| 0.4277 | 15.0 | 2812 | 0.6005 | 2.0305 |
| 0.4298 | 16.0 | 3000 | 0.6005 | 2.0478 |
| 0.4082 | 17.0 | 3187 | 0.6007 | 2.0539 |
| 0.411 | 18.0 | 3375 | 0.6005 | 2.0314 |
| 0.4113 | 19.0 | 3562 | 0.6011 | 2.0368 |
| 0.4121 | 20.0 | 3750 | 0.6017 | 2.1022 |
| 0.414 | 21.0 | 3937 | 0.6007 | 2.0512 |
| 0.4163 | 22.0 | 4125 | 0.6016 | 2.1147 |
| 0.4172 | 23.0 | 4312 | 0.6007 | 2.0942 |
| 0.4156 | 24.0 | 4500 | 0.6008 | 2.1201 |
| 0.3997 | 25.0 | 4687 | 0.6010 | 2.0660 |
| 0.3994 | 26.0 | 4875 | 0.6006 | 2.0832 |
| 0.4032 | 27.0 | 5062 | 0.6003 | 2.1423 |
| 0.4058 | 28.0 | 5250 | 0.6015 | 2.1000 |
| 0.4065 | 29.0 | 5437 | 0.6009 | 2.1065 |
| 0.4068 | 30.0 | 5625 | 0.6006 | 2.1389 |
| 0.4091 | 31.0 | 5812 | 0.6005 | 2.1241 |
| 0.4103 | 32.0 | 6000 | 0.6010 | 2.1241 |
| 0.3959 | 33.0 | 6187 | 0.6021 | 2.1206 |
| 0.3974 | 34.0 | 6375 | 0.6017 | 2.1061 |
| 0.3983 | 35.0 | 6562 | 0.6013 | 2.1041 |
| 0.4034 | 36.0 | 6750 | 0.6017 | 2.0843 |
| 0.4035 | 37.0 | 6937 | 0.6035 | 2.0837 |
| 0.4013 | 38.0 | 7125 | 0.6015 | 2.1708 |
| 0.4063 | 39.0 | 7312 | 0.602 | 2.0946 |
| 0.4049 | 40.0 | 7500 | 0.6019 | 2.1671 |
| 0.391 | 41.0 | 7687 | 0.6026 | 2.1508 |
| 0.3913 | 42.0 | 7875 | 0.5998 | 2.2062 |
| 0.3945 | 43.0 | 8062 | 0.6012 | 2.2214 |
| 0.3953 | 44.0 | 8250 | 0.6005 | 2.2576 |
| 0.3959 | 45.0 | 8437 | 0.6001 | 2.2755 |
| 0.3961 | 46.0 | 8625 | 0.6014 | 2.3085 |
| 0.3982 | 47.0 | 8812 | 0.5992 | 2.3093 |
| 0.4028 | 48.0 | 9000 | 0.6007 | 2.1926 |
| 0.3915 | 49.0 | 9187 | 0.6018 | 2.0674 |
| 0.4009 | 49.87 | 9350 | 0.6011 | 2.0414 |
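
In the log above, training loss keeps falling while validation loss rises after the first epoch, a pattern usually read as overfitting. The sketch below picks the best evaluation rows from a subset of the log (the first five epochs and the final row, copied verbatim). The `EvalRow` type and variable names are illustrative and not part of the original training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    accuracy: float
    val_loss: float


# Subset of rows copied from the training log above.
rows = [
    EvalRow(1.598, 1.0, 187, 0.6147, 1.2692),
    EvalRow(1.1923, 2.0, 375, 0.6176, 1.2733),
    EvalRow(0.9732, 3.0, 562, 0.6136, 1.3396),
    EvalRow(0.7763, 4.0, 750, 0.6104, 1.4358),
    EvalRow(0.6498, 5.0, 937, 0.6052, 1.5630),
    EvalRow(0.4009, 49.87, 9350, 0.6011, 2.0414),
]

# Row with the highest evaluation accuracy.
best_acc = max(rows, key=lambda r: r.accuracy)
# Row with the lowest validation loss.
best_loss = min(rows, key=lambda r: r.val_loss)

print(f"highest accuracy: {best_acc.accuracy} at epoch {best_acc.epoch}")
print(f"lowest validation loss: {best_loss.val_loss} at epoch {best_loss.epoch}")
```

The same two rows (epoch 2.0 for accuracy, epoch 1.0 for validation loss) are also the best across the full log, so an early checkpoint may be a better choice than the final one if such checkpoints were saved.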

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1
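
When trying to reproduce these results, it can help to compare a local environment against the versions listed above. The following is a minimal sketch using only the standard library. It assumes the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and ignores local build tags such as `+cu121`; the helper names are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0",
    "datasets": "2.18.0",
    "tokenizers": "0.14.1",
}


def installed_version(dist: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


def find_mismatches(expected: dict) -> dict:
    """Map each distribution whose installed version differs to (wanted, found)."""
    mismatches = {}
    for dist, want in expected.items():
        have = installed_version(dist)
        # Drop any local build tag (for example "+cu121") before comparing.
        base = have.split("+")[0] if have else None
        if base != want:
            mismatches[dist] = (want, have)
    return mismatches


for dist, (want, have) in find_mismatches(EXPECTED).items():
    print(f"{dist}: card lists {want}, found {have or 'not installed'}")
```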