lmind_hotpot_train8000_eval7405_v1_qa_5e-5_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_hotpot_train8000_eval7405_v1_qa dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6692
  • Accuracy: 0.5849
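
The run name ("…_lora2") and learning-rate suffix suggest this checkpoint is a LoRA adapter trained with PEFT rather than a full fine-tune. Below is a minimal loading-and-generation sketch under that assumption; the QA prompt format is illustrative only, since the training format is not documented in this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "tyzhu/lmind_hotpot_train8000_eval7405_v1_qa_5e-5_lora2"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the LoRA adapter weights on top of the frozen base model
# (assumes the repo contains a PEFT adapter, per the "lora" run name).
model = PeftModel.from_pretrained(base, adapter_id)

# Illustrative prompt only: the exact QA format used in training is not documented here.
prompt = "Question: Who is the author of Dracula?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```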

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
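
As a reading aid, here is a hedged sketch of how the values above map onto transformers.TrainingArguments for the Hugging Face Trainer; anything not listed above (the output_dir name, the per-epoch evaluation cadence implied by the results table) is an assumption:

```python
from transformers import TrainingArguments

# 2 per device * 4 GPUs * 4 accumulation steps = total train batch size 32;
# 2 per device * 4 GPUs = total eval batch size 8, matching the card.
args = TrainingArguments(
    output_dir="lmind_hotpot_train8000_eval7405_v1_qa_5e-5_lora2",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    num_train_epochs=50.0,
    evaluation_strategy="epoch",  # assumed from the per-epoch rows in the results table
)
# Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults, so no override is needed.
```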

Training results

| Training Loss | Epoch | Step  | Accuracy | Validation Loss |
|:-------------:|:-----:|:-----:|:--------:|:---------------:|
| 1.798         | 1.0   | 250   | 0.6067   | 1.8213          |
| 1.7           | 2.0   | 500   | 0.6077   | 1.8046          |
| 1.5869        | 3.0   | 750   | 0.6071   | 1.8293          |
| 1.4349        | 4.0   | 1000  | 0.6043   | 1.8974          |
| 1.3111        | 5.0   | 1250  | 0.6015   | 1.9769          |
| 1.197         | 6.0   | 1500  | 0.5992   | 2.0635          |
| 1.0729        | 7.0   | 1750  | 0.5975   | 2.1523          |
| 0.9833        | 8.0   | 2000  | 0.5947   | 2.2640          |
| 0.8672        | 9.0   | 2250  | 0.5924   | 2.3643          |
| 0.7883        | 10.0  | 2500  | 0.5908   | 2.4598          |
| 0.6879        | 11.0  | 2750  | 0.5890   | 2.5669          |
| 0.6295        | 12.0  | 3000  | 0.5885   | 2.7000          |
| 0.5545        | 13.0  | 3250  | 0.5851   | 2.8281          |
| 0.5208        | 14.0  | 3500  | 0.5853   | 2.8794          |
| 0.4679        | 15.0  | 3750  | 0.5863   | 2.9184          |
| 0.4464        | 16.0  | 4000  | 0.5852   | 3.0791          |
| 0.4136        | 17.0  | 4250  | 0.5856   | 3.0832          |
| 0.4021        | 18.0  | 4500  | 0.5847   | 3.0944          |
| 0.3776        | 19.0  | 4750  | 0.5828   | 3.2120          |
| 0.373         | 20.0  | 5000  | 0.5839   | 3.2298          |
| 0.3572        | 21.0  | 5250  | 0.5841   | 3.2434          |
| 0.3517        | 22.0  | 5500  | 0.5847   | 3.2606          |
| 0.3374        | 23.0  | 5750  | 0.5845   | 3.3392          |
| 0.3338        | 24.0  | 6000  | 0.5841   | 3.3489          |
| 0.3286        | 25.0  | 6250  | 0.5846   | 3.4036          |
| 0.3259        | 26.0  | 6500  | 0.5849   | 3.3878          |
| 0.3175        | 27.0  | 6750  | 0.5853   | 3.4960          |
| 0.3185        | 28.0  | 7000  | 0.5852   | 3.4873          |
| 0.3117        | 29.0  | 7250  | 0.5840   | 3.4780          |
| 0.3125        | 30.0  | 7500  | 0.5836   | 3.5383          |
| 0.3041        | 31.0  | 7750  | 0.5841   | 3.5253          |
| 0.3047        | 32.0  | 8000  | 0.5853   | 3.5283          |
| 0.2982        | 33.0  | 8250  | 0.5833   | 3.5511          |
| 0.3013        | 34.0  | 8500  | 0.5852   | 3.5445          |
| 0.295         | 35.0  | 8750  | 0.5841   | 3.5891          |
| 0.2988        | 36.0  | 9000  | 0.5833   | 3.6198          |
| 0.2939        | 37.0  | 9250  | 0.5842   | 3.5708          |
| 0.2952        | 38.0  | 9500  | 0.5833   | 3.6124          |
| 0.2927        | 39.0  | 9750  | 0.5840   | 3.6413          |
| 0.2931        | 40.0  | 10000 | 0.5828   | 3.6555          |
| 0.2891        | 41.0  | 10250 | 0.5841   | 3.6471          |
| 0.291         | 42.0  | 10500 | 0.5846   | 3.7233          |
| 0.2886        | 43.0  | 10750 | 0.5850   | 3.6348          |
| 0.289         | 44.0  | 11000 | 0.5839   | 3.6786          |
| 0.2846        | 45.0  | 11250 | 0.5845   | 3.6846          |
| 0.2858        | 46.0  | 11500 | 0.5855   | 3.7088          |
| 0.283         | 47.0  | 11750 | 0.5842   | 3.6938          |
| 0.2863        | 48.0  | 12000 | 0.5830   | 3.6793          |
| 0.2782        | 49.0  | 12250 | 0.5839   | 3.6805          |
| 0.2834        | 50.0  | 12500 | 0.5849   | 3.6692          |

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1