tyzhu/lmind_hotpot_train8000_eval7405_v1_docidx_meta-llama_Llama-2-7b-hf_5e-5_lora2

Generated from Trainer


tyzhu committed on Jun 6, 2024

Commit 485fb3f · verified · 1 Parent(s): c332322

Model save

Files changed (1)

README.md +85 -0

README.md ADDED

	@@ -0,0 +1,85 @@

+---
+license: llama2
+base_model: meta-llama/Llama-2-7b-hf
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: lmind_hotpot_train8000_eval7405_v1_docidx_meta-llama_Llama-2-7b-hf_5e-5_lora2
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# lmind_hotpot_train8000_eval7405_v1_docidx_meta-llama_Llama-2-7b-hf_5e-5_lora2
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.7747
+- Accuracy: 0.7938
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 20.0
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
+|:-------------:|:-----:|:-----:|:---------------:|:--------:|
+| 1.1109        | 1.0   | 839   | 1.3437          | 0.7529   |
+| 1.0723        | 2.0   | 1678  | 1.2988          | 0.7550   |
+| 1.0453        | 3.0   | 2517  | 1.2332          | 0.7577   |
+| 1.0099        | 4.0   | 3357  | 1.2146          | 0.7604   |
+| 0.9553        | 5.0   | 4196  | 1.1671          | 0.7632   |
+| 0.8876        | 6.0   | 5035  | 1.1263          | 0.7655   |
+| 0.8352        | 7.0   | 5874  | 1.0776          | 0.7681   |
+| 0.7872        | 8.0   | 6714  | 1.0745          | 0.7706   |
+| 0.7297        | 9.0   | 7553  | 1.0479          | 0.7730   |
+| 0.6831        | 10.0  | 8392  | 1.0078          | 0.7754   |
+| 0.6397        | 11.0  | 9231  | 0.9763          | 0.7779   |
+| 0.5885        | 12.0  | 10071 | 0.9702          | 0.7803   |
+| 0.5379        | 13.0  | 10910 | 0.9445          | 0.7824   |
+| 0.4996        | 14.0  | 11749 | 0.9087          | 0.7846   |
+| 0.464         | 15.0  | 12588 | 0.8827          | 0.7866   |
+| 0.4225        | 16.0  | 13428 | 0.8886          | 0.7881   |
+| 0.4259        | 17.0  | 14267 | 0.8224          | 0.7898   |
+| 0.361         | 18.0  | 15106 | 0.7985          | 0.7913   |
+| 0.3429        | 19.0  | 15945 | 0.7804          | 0.7930   |
+| 0.305         | 19.99 | 16780 | 0.7747          | 0.7938   |
+### Framework versions
+- Transformers 4.34.0
+- PyTorch 2.1.0+cu121
+- Datasets 2.18.0
+- Tokenizers 0.14.1
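
The batch-size figures in the hyperparameter list above can be cross-checked with a little arithmetic. The sketch below is a reader's sanity check, not part of the committed model card. It assumes the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that evaluation uses no accumulation. The helper name `effective_batch_size` is hypothetical. The examples-per-epoch figure is only an estimate derived from the step count in the first row of the results table, since the final batch of an epoch may be partial.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of examples consumed per optimizer step (hypothetical helper)."""
    return per_device * num_devices * grad_accum_steps


# Values copied from the hyperparameter list in the model card.
train_total = effective_batch_size(per_device=2, num_devices=4, grad_accum_steps=4)
eval_total = effective_batch_size(per_device=2, num_devices=4)  # assumed: no accumulation at eval

# The results table reports step 839 at epoch 1.0.
steps_per_epoch = 839
approx_examples_per_epoch = steps_per_epoch * train_total  # rough estimate only

print(f"total_train_batch_size: {train_total}")
print(f"total_eval_batch_size: {eval_total}")
print(f"approx. examples per epoch: {approx_examples_per_epoch}")
```

Under these assumptions the computed totals agree with the reported `total_train_batch_size: 32` and `total_eval_batch_size: 8`.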