tyzhu/squad_qa_no_id_v5_full_meta-llama_Llama-2-7b-hf_1e-4_lora

Generated from Trainer

tyzhu committed on Jun 5, 2024

Commit ed8af2f (1 parent: 644302d)

Model save

Files changed (1)

README.md +115 -0

README.md ADDED

	@@ -0,0 +1,115 @@

+---
+license: llama2
+base_model: meta-llama/Llama-2-7b-hf
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: squad_qa_no_id_v5_full_meta-llama_Llama-2-7b-hf_1e-4_lora
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# squad_qa_no_id_v5_full_meta-llama_Llama-2-7b-hf_1e-4_lora
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset.
+It achieves the following results on the evaluation set at the final checkpoint (epoch 49.84, step 7900):
+- Loss: 2.9829
+- Accuracy: 0.6116
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 1.6368        | 1.0   | 158  | 1.6276          | 0.624    |
+| 1.0335        | 2.0   | 317  | 1.6646          | 0.6236   |
+| 0.7775        | 3.0   | 475  | 1.7627          | 0.6207   |
+| 0.49          | 4.0   | 634  | 1.9259          | 0.6165   |
+| 0.3885        | 5.0   | 792  | 2.0775          | 0.6135   |
+| 0.2821        | 6.0   | 951  | 2.2256          | 0.6115   |
+| 0.2484        | 7.0   | 1109 | 2.3241          | 0.6106   |
+| 0.2254        | 8.0   | 1268 | 2.3944          | 0.6104   |
+| 0.217         | 9.0   | 1426 | 2.5084          | 0.6102   |
+| 0.2063        | 10.0  | 1585 | 2.5607          | 0.6095   |
+| 0.2004        | 11.0  | 1743 | 2.6099          | 0.6084   |
+| 0.2001        | 12.0  | 1902 | 2.6986          | 0.6072   |
+| 0.1887        | 13.0  | 2060 | 2.7109          | 0.6086   |
+| 0.198         | 14.0  | 2219 | 2.6802          | 0.6098   |
+| 0.1898        | 15.0  | 2377 | 2.6273          | 0.6088   |
+| 0.1956        | 16.0  | 2536 | 2.7269          | 0.6092   |
+| 0.1857        | 17.0  | 2694 | 2.6266          | 0.6112   |
+| 0.1879        | 18.0  | 2853 | 2.6884          | 0.6095   |
+| 0.1927        | 19.0  | 3011 | 2.6882          | 0.6104   |
+| 0.1829        | 20.0  | 3170 | 2.6503          | 0.6105   |
+| 0.1902        | 21.0  | 3328 | 2.7196          | 0.6096   |
+| 0.187         | 22.0  | 3487 | 2.5676          | 0.6096   |
+| 0.1832        | 23.0  | 3645 | 2.7033          | 0.6087   |
+| 0.1925        | 24.0  | 3804 | 2.7632          | 0.6076   |
+| 0.1812        | 25.0  | 3962 | 2.8529          | 0.6066   |
+| 0.1862        | 26.0  | 4121 | 2.7649          | 0.6078   |
+| 0.1841        | 27.0  | 4279 | 2.7719          | 0.6101   |
+| 0.1835        | 28.0  | 4438 | 2.8693          | 0.6087   |
+| 0.1807        | 29.0  | 4596 | 2.9207          | 0.6081   |
+| 0.1773        | 30.0  | 4755 | 2.9250          | 0.6078   |
+| 0.1808        | 31.0  | 4913 | 3.0189          | 0.6090   |
+| 0.1775        | 32.0  | 5072 | 3.0751          | 0.6085   |
+| 0.1775        | 33.0  | 5230 | 3.0890          | 0.6076   |
+| 0.1802        | 34.0  | 5389 | 3.1098          | 0.608    |
+| 0.1794        | 35.0  | 5547 | 3.0633          | 0.6092   |
+| 0.1798        | 36.0  | 5706 | 3.2008          | 0.6068   |
+| 0.1764        | 37.0  | 5864 | 3.1595          | 0.6084   |
+| 0.1794        | 38.0  | 6023 | 2.7637          | 0.6092   |
+| 0.1938        | 39.0  | 6181 | 2.6485          | 0.6093   |
+| 0.1898        | 40.0  | 6340 | 2.7094          | 0.6082   |
+| 0.1912        | 41.0  | 6498 | 2.6494          | 0.6113   |
+| 0.1839        | 42.0  | 6657 | 2.7422          | 0.6103   |
+| 0.1815        | 43.0  | 6815 | 2.7747          | 0.6102   |
+| 0.1725        | 44.0  | 6974 | 2.8100          | 0.6104   |
+| 0.1754        | 45.0  | 7132 | 2.9507          | 0.6105   |
+| 0.1745        | 46.0  | 7291 | 2.9690          | 0.6107   |
+| 0.1758        | 47.0  | 7449 | 2.9188          | 0.6113   |
+| 0.1793        | 48.0  | 7608 | 2.8621          | 0.6125   |
+| 0.1729        | 49.0  | 7766 | 2.9604          | 0.6126   |
+| 0.1793        | 49.84 | 7900 | 2.9829          | 0.6116   |
+### Framework versions
+- Transformers 4.34.0
+- Pytorch 2.1.0+cu121
+- Datasets 2.18.0
+- Tokenizers 0.14.1
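
The totals reported under "Training hyperparameters" can be reproduced from the per-device settings, and the "Training results" table can be scanned for the checkpoint with the best evaluation numbers. The following is a minimal sketch of both checks, not part of the committed README. The helper names (`effective_batch_size`, `EvalRow`, `best_by_val_loss`, `best_by_accuracy`) are hypothetical, and `ROWS` is a hand-copied subset of the table rather than the full log.

```python
from typing import List, NamedTuple


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples processed per optimizer step (or per eval step when grad_accum_steps is 1)."""
    return per_device * num_devices * grad_accum_steps


class EvalRow(NamedTuple):
    epoch: float
    step: int
    train_loss: float
    val_loss: float
    accuracy: float


# Hand-copied subset of the "Training results" table (assumption: these rows
# are enough to illustrate the trend; the full table has 50 rows).
ROWS: List[EvalRow] = [
    EvalRow(1.0, 158, 1.6368, 1.6276, 0.624),
    EvalRow(2.0, 317, 1.0335, 1.6646, 0.6236),
    EvalRow(3.0, 475, 0.7775, 1.7627, 0.6207),
    EvalRow(10.0, 1585, 0.2063, 2.5607, 0.6095),
    EvalRow(25.0, 3962, 0.1812, 2.8529, 0.6066),
    EvalRow(36.0, 5706, 0.1798, 3.2008, 0.6068),
    EvalRow(49.0, 7766, 0.1729, 2.9604, 0.6126),
    EvalRow(49.84, 7900, 0.1793, 2.9829, 0.6116),
]


def best_by_val_loss(rows: List[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Row with the highest evaluation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


if __name__ == "__main__":
    # train_batch_size=1, num_devices=4, gradient_accumulation_steps=8
    print("total_train_batch_size:", effective_batch_size(1, 4, 8))
    # eval_batch_size=2, num_devices=4, no gradient accumulation at eval time
    print("total_eval_batch_size:", effective_batch_size(2, 4))
    print("lowest validation loss:", best_by_val_loss(ROWS))
    print("highest accuracy:", best_by_accuracy(ROWS))
```

In the rows above, both the lowest validation loss and the highest accuracy occur at epoch 1.0, while training loss keeps falling in later epochs. That pattern is consistent with overfitting, so an earlier checkpoint may be worth evaluating if one was saved; the card itself does not say which checkpoints were kept.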