tyzhu/squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora

PEFT

Safetensors

Generated from Trainer

tyzhu committed on Jun 4, 2024

Commit d2019d8 (verified), 1 parent: 58f615e

Model save

Files changed (1)

README.md +117 -0

README.md ADDED

@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# squad_qa_baseline_v5_full_Qwen_Qwen1.5-4B_3e-5_lora
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.8632
+- Accuracy: 0.5660
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:--------:|
+| No log        | 0.9916  | 74   | 2.0550          | 0.5952   |
+| 2.3403        | 1.9966  | 149  | 2.0411          | 0.5933   |
+| 2.0198        | 2.9883  | 223  | 2.0403          | 0.5932   |
+| 2.0198        | 3.9933  | 298  | 2.0647          | 0.5922   |
+| 1.9239        | 4.9983  | 373  | 2.0999          | 0.5921   |
+| 1.7309        | 5.9899  | 447  | 2.1973          | 0.5879   |
+| 1.5254        | 6.9950  | 522  | 2.2753          | 0.5861   |
+| 1.5254        | 8.0     | 597  | 2.4079          | 0.5819   |
+| 1.2937        | 8.9916  | 671  | 2.5096          | 0.5775   |
+| 1.0409        | 9.9966  | 746  | 2.6079          | 0.5739   |
+| 0.8766        | 10.9883 | 820  | 2.7579          | 0.5718   |
+| 0.8766        | 11.9933 | 895  | 2.8722          | 0.5688   |
+| 0.721         | 12.9983 | 970  | 2.9797          | 0.5672   |
+| 0.6011        | 13.9899 | 1044 | 3.0708          | 0.5662   |
+| 0.5455        | 14.9950 | 1119 | 3.1660          | 0.5648   |
+| 0.5455        | 16.0    | 1194 | 3.2479          | 0.5650   |
+| 0.5003        | 16.9916 | 1268 | 3.2445          | 0.5655   |
+| 0.4683        | 17.9966 | 1343 | 3.2800          | 0.5638   |
+| 0.457         | 18.9883 | 1417 | 3.4280          | 0.5640   |
+| 0.457         | 19.9933 | 1492 | 3.4113          | 0.5662   |
+| 0.4441        | 20.9983 | 1567 | 3.4731          | 0.5637   |
+| 0.4327        | 21.9899 | 1641 | 3.5407          | 0.5639   |
+| 0.4308        | 22.9950 | 1716 | 3.4811          | 0.5640   |
+| 0.4308        | 24.0    | 1791 | 3.5854          | 0.5642   |
+| 0.4245        | 24.9916 | 1865 | 3.5206          | 0.5640   |
+| 0.416         | 25.9966 | 1940 | 3.6091          | 0.5638   |
+| 0.4173        | 26.9883 | 2014 | 3.5707          | 0.5643   |
+| 0.4173        | 27.9933 | 2089 | 3.6671          | 0.5648   |
+| 0.4117        | 28.9983 | 2164 | 3.6267          | 0.5631   |
+| 0.409         | 29.9899 | 2238 | 3.6658          | 0.5604   |
+| 0.4085        | 30.9950 | 2313 | 3.6984          | 0.5621   |
+| 0.4085        | 32.0    | 2388 | 3.6584          | 0.5660   |
+| 0.403         | 32.9916 | 2462 | 3.5848          | 0.5626   |
+| 0.404         | 33.9966 | 2537 | 3.6365          | 0.5631   |
+| 0.4013        | 34.9883 | 2611 | 3.7047          | 0.5647   |
+| 0.4013        | 35.9933 | 2686 | 3.7735          | 0.5643   |
+| 0.3987        | 36.9983 | 2761 | 3.6867          | 0.5657   |
+| 0.3951        | 37.9899 | 2835 | 3.7349          | 0.5662   |
+| 0.3971        | 38.9950 | 2910 | 3.7173          | 0.5643   |
+| 0.3971        | 40.0    | 2985 | 3.8004          | 0.5643   |
+| 0.3939        | 40.9916 | 3059 | 3.8041          | 0.5636   |
+| 0.3912        | 41.9966 | 3134 | 3.8263          | 0.5648   |
+| 0.3941        | 42.9883 | 3208 | 3.7954          | 0.5646   |
+| 0.3941        | 43.9933 | 3283 | 3.8001          | 0.5637   |
+| 0.3878        | 44.9983 | 3358 | 3.8438          | 0.5634   |
+| 0.3879        | 45.9899 | 3432 | 3.8626          | 0.5631   |
+| 0.3907        | 46.9950 | 3507 | 3.7882          | 0.5645   |
+| 0.3907        | 48.0    | 3582 | 3.8001          | 0.5622   |
+| 0.3864        | 48.9916 | 3656 | 3.7201          | 0.5609   |
+| 0.3871        | 49.5812 | 3700 | 3.8632          | 0.5660   |
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.40.2
+- Pytorch 2.3.0
+- Datasets 2.19.1
+- Tokenizers 0.19.1
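
The derived values in the card can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes the usual relationship in which the total training batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that accumulation does not apply at evaluation time. The variable names are illustrative, and `rows` holds only a few rows copied from the results table (the first five evaluations and the last one).

```python
# Values copied from the "Training hyperparameters" list in the card.
train_batch_size = 1             # per device
eval_batch_size = 2              # per device
num_devices = 4
gradient_accumulation_steps = 8

# Assumed relationship: accumulation multiplies the training batch only.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# (epoch, step, validation_loss, accuracy) for a subset of the results table.
rows = [
    (0.9916, 74, 2.0550, 0.5952),
    (1.9966, 149, 2.0411, 0.5933),
    (2.9883, 223, 2.0403, 0.5932),
    (3.9933, 298, 2.0647, 0.5922),
    (4.9983, 373, 2.0999, 0.5921),
    (49.5812, 3700, 3.8632, 0.5660),
]

# Evaluation with the lowest validation loss and with the highest accuracy.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_accuracy = max(rows, key=lambda r: r[3])
final_row = rows[-1]

print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
print("lowest validation loss at step", best_by_loss[1], "->", best_by_loss[2])
print("highest accuracy at step", best_by_accuracy[1], "->", best_by_accuracy[3])
print("final evaluation at step", final_row[1], "->", final_row[2], final_row[3])
```

The computed totals agree with the 32 and 8 listed in the card. In the full table, validation loss is lowest at step 223 (2.0403) and rises afterwards while training loss keeps falling, so the headline figures (Loss 3.8632, Accuracy 0.5660) correspond to the final evaluation at step 3700 rather than to the lowest-loss evaluation.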