tyzhu
/

squad_qa_title_v5_full_add3_meta-llama_Llama-2-7b-hf_1e-4_lora

Generated from Trainer

tyzhu committed on Jun 5, 2024

Commit

2fa6059

verified ·

1 Parent(s): 79522b3

Model save

Files changed (1)

README.md +115 -0

README.md ADDED

	@@ -0,0 +1,115 @@

+---
+license: llama2
+base_model: meta-llama/Llama-2-7b-hf
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: squad_qa_title_v5_full_add3_meta-llama_Llama-2-7b-hf_1e-4_lora
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# squad_qa_title_v5_full_add3_meta-llama_Llama-2-7b-hf_1e-4_lora
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.0766
+- Accuracy: 0.6869
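The reported loss can be turned into a perplexity figure, which some readers find easier to compare across runs. The sketch below assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card.

```python
import math

# Final evaluation loss reported in this model card.
eval_loss = 2.0766

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity {perplexity:.2f}")
```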
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
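The two "total" batch sizes in the list follow from the per-device values. A minimal check of that arithmetic, using only the numbers listed above (the variable names are illustrative):

```python
# Values copied from the hyperparameter list.
train_batch_size = 1              # per device
eval_batch_size = 2               # per device
num_devices = 4
gradient_accumulation_steps = 8

# Effective training batch: per-device batch x devices x accumulation steps.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only the device count scales it.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```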
+### Training results
+| Training Loss | Epoch | Step | Accuracy | Validation Loss |
+|:-------------:|:-----:|:----:|:--------:|:---------------:|
+| 1.4665        | 1.0   | 158  | 0.7027   | 1.1937          |
+| 0.8192        | 2.0   | 317  | 0.7082   | 1.1604          |
+| 0.5795        | 2.99  | 475  | 0.7050   | 1.1993          |
+| 0.3605        | 4.0   | 634  | 0.7037   | 1.2827          |
+| 0.2894        | 5.0   | 793  | 0.7010   | 1.3732          |
+| 0.2089        | 6.0   | 951  | 0.6985   | 1.4418          |
+| 0.1866        | 7.0   | 1110 | 0.7002   | 1.4958          |
+| 0.168         | 8.0   | 1269 | 0.6973   | 1.5733          |
+| 0.1627        | 9.0   | 1427 | 0.6966   | 1.6454          |
+| 0.1549        | 10.0  | 1586 | 0.6985   | 1.6570          |
+| 0.1497        | 10.99 | 1744 | 0.6940   | 1.7429          |
+| 0.1534        | 12.0  | 1903 | 0.6982   | 1.7459          |
+| 0.1444        | 13.0  | 2062 | 0.6955   | 1.7857          |
+| 0.148         | 14.0  | 2220 | 0.6954   | 1.7621          |
+| 0.1462        | 15.0  | 2379 | 0.6957   | 1.7651          |
+| 0.1464        | 16.0  | 2538 | 0.6962   | 1.7384          |
+| 0.1405        | 17.0  | 2696 | 0.6935   | 1.8738          |
+| 0.1394        | 18.0  | 2855 | 0.6941   | 1.8427          |
+| 0.1445        | 18.99 | 3013 | 0.6937   | 1.7709          |
+| 0.1389        | 20.0  | 3172 | 0.6934   | 1.8840          |
+| 0.1413        | 21.0  | 3331 | 0.6948   | 1.8034          |
+| 0.142         | 22.0  | 3489 | 0.6893   | 1.8046          |
+| 0.1421        | 23.0  | 3648 | 0.6882   | 1.8369          |
+| 0.144         | 24.0  | 3807 | 0.6858   | 1.8879          |
+| 0.1348        | 25.0  | 3965 | 0.69     | 1.8530          |
+| 0.138         | 26.0  | 4124 | 0.6905   | 1.8132          |
+| 0.138         | 26.99 | 4282 | 0.6858   | 1.9304          |
+| 0.137         | 28.0  | 4441 | 0.6877   | 1.9670          |
+| 0.14          | 29.0  | 4600 | 0.6856   | 1.9993          |
+| 0.1337        | 30.0  | 4758 | 0.6848   | 1.8712          |
+| 0.1373        | 31.0  | 4917 | 0.6870   | 1.8732          |
+| 0.134         | 32.0  | 5076 | 0.6862   | 1.9648          |
+| 0.1363        | 33.0  | 5234 | 0.6872   | 1.9204          |
+| 0.1365        | 34.0  | 5393 | 0.6854   | 1.9778          |
+| 0.135         | 34.99 | 5551 | 0.6840   | 1.9516          |
+| 0.1355        | 36.0  | 5710 | 0.6841   | 2.0177          |
+| 0.1343        | 37.0  | 5869 | 0.6852   | 2.0255          |
+| 0.1321        | 38.0  | 6027 | 0.6843   | 1.9995          |
+| 0.1313        | 39.0  | 6162 | 0.6838   | 1.9035          |
+| 0.1361        | 40.0  | 6321 | 0.6850   | 1.9624          |
+| 0.1345        | 40.99 | 6479 | 0.6861   | 1.9221          |
+| 0.1353        | 42.0  | 6638 | 0.6841   | 2.0262          |
+| 0.1312        | 43.0  | 6797 | 0.6859   | 1.9510          |
+| 0.1313        | 44.0  | 6955 | 0.6845   | 2.0107          |
+| 0.13          | 45.0  | 7114 | 0.6870   | 1.9279          |
+| 0.1311        | 46.0  | 7273 | 0.6878   | 1.9542          |
+| 0.1326        | 47.0  | 7431 | 0.6845   | 2.0657          |
+| 0.1292        | 48.0  | 7590 | 0.6854   | 1.9569          |
+| 0.1315        | 48.99 | 7748 | 0.6879   | 1.8985          |
+| 0.1341        | 49.95 | 7900 | 0.6869   | 2.0766          |
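In the table, validation loss is lowest early in training and rises afterwards while training loss keeps falling, a pattern usually associated with overfitting. The sketch below picks the row with the lowest validation loss from a hand-copied subset of the table. The final row uses the loss and accuracy reported in the evaluation summary. The tuple layout and variable names are illustrative, not part of the Trainer output.

```python
# (epoch, step, accuracy, validation_loss) for a subset of the table rows.
rows = [
    (1.0, 158, 0.7027, 1.1937),
    (2.0, 317, 0.7082, 1.1604),
    (2.99, 475, 0.7050, 1.1993),
    (4.0, 634, 0.7037, 1.2827),
    (5.0, 793, 0.7010, 1.3732),
    (49.95, 7900, 0.6869, 2.0766),  # final checkpoint
]

# Select the checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r[3])
last = rows[-1]

print(f"best: epoch {best[0]}, step {best[1]}, val loss {best[3]}")
print(f"last: epoch {last[0]}, step {last[1]}, val loss {last[3]}")
```

On this subset the selected row is the epoch 2.0 checkpoint, which also has the highest accuracy among the listed rows.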
+### Framework versions
+- Transformers 4.34.0
+- Pytorch 2.1.0+cu121
+- Datasets 2.18.0
+- Tokenizers 0.14.1