tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_3e-5_lora2


tyzhu committed on Jun 8, 2024

Commit 6afaf7f (verified) · 1 Parent(s): 510f342

Model save


README.md +117 -0

README.md ADDED

	@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: lmind_nq_train6000_eval6489_v1_docidx_v3_3e-5_lora2
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# lmind_nq_train6000_eval6489_v1_docidx_v3_3e-5_lora2
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.3965
+- Accuracy: 0.4106
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
+### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
+|:-------------:|:-------:|:-----:|:---------------:|:--------:|
+| 1.9686        | 0.9985  | 341   | 3.6919          | 0.4374   |
+| 1.9337        | 2.0     | 683   | 3.7485          | 0.4476   |
+| 1.9033        | 2.9985  | 1024  | 3.8826          | 0.4496   |
+| 1.857         | 4.0     | 1366  | 3.9701          | 0.4481   |
+| 1.8042        | 4.9985  | 1707  | 4.1171          | 0.4473   |
+| 1.7443        | 6.0     | 2049  | 4.1837          | 0.4470   |
+| 1.7019        | 6.9985  | 2390  | 4.2604          | 0.4462   |
+| 1.6305        | 8.0     | 2732  | 4.4065          | 0.4415   |
+| 1.6056        | 8.9985  | 3073  | 4.4487          | 0.4398   |
+| 1.5521        | 10.0    | 3415  | 4.5474          | 0.4389   |
+| 1.4934        | 10.9985 | 3756  | 4.5898          | 0.4367   |
+| 1.4287        | 12.0    | 4098  | 4.6911          | 0.4355   |
+| 1.3846        | 12.9985 | 4439  | 4.7629          | 0.4355   |
+| 1.3185        | 14.0    | 4781  | 4.7585          | 0.4328   |
+| 1.2667        | 14.9985 | 5122  | 4.9389          | 0.4309   |
+| 1.2144        | 16.0    | 5464  | 4.8987          | 0.4303   |
+| 1.1708        | 16.9985 | 5805  | 5.0017          | 0.4297   |
+| 1.1146        | 18.0    | 6147  | 4.9778          | 0.4307   |
+| 1.0531        | 18.9985 | 6488  | 5.1216          | 0.4287   |
+| 1.0158        | 20.0    | 6830  | 5.1210          | 0.4273   |
+| 0.9555        | 20.9985 | 7171  | 5.1988          | 0.4293   |
+| 0.9205        | 22.0    | 7513  | 5.2240          | 0.4270   |
+| 0.8711        | 22.9985 | 7854  | 5.3467          | 0.4251   |
+| 0.8082        | 24.0    | 8196  | 5.3555          | 0.4243   |
+| 0.7854        | 24.9985 | 8537  | 5.4629          | 0.4241   |
+| 0.7359        | 26.0    | 8879  | 5.4699          | 0.4231   |
+| 0.7002        | 26.9985 | 9220  | 5.5053          | 0.4211   |
+| 0.6684        | 28.0    | 9562  | 5.5584          | 0.4214   |
+| 0.6236        | 28.9985 | 9903  | 5.6117          | 0.4196   |
+| 0.5866        | 30.0    | 10245 | 5.5696          | 0.4196   |
+| 0.5554        | 30.9985 | 10586 | 5.6579          | 0.4184   |
+| 0.5291        | 32.0    | 10928 | 5.7396          | 0.4178   |
+| 0.4968        | 32.9985 | 11269 | 5.8110          | 0.4181   |
+| 0.4635        | 34.0    | 11611 | 5.8719          | 0.4167   |
+| 0.4465        | 34.9985 | 11952 | 5.8658          | 0.4162   |
+| 0.4184        | 36.0    | 12294 | 5.8887          | 0.4147   |
+| 0.4002        | 36.9985 | 12635 | 5.9950          | 0.4165   |
+| 0.3716        | 38.0    | 12977 | 5.9991          | 0.4155   |
+| 0.3537        | 38.9985 | 13318 | 6.0723          | 0.4151   |
+| 0.3361        | 40.0    | 13660 | 6.0777          | 0.4127   |
+| 0.3199        | 40.9985 | 14001 | 6.1181          | 0.4150   |
+| 0.303         | 42.0    | 14343 | 6.0911          | 0.4134   |
+| 0.2797        | 42.9985 | 14684 | 6.1607          | 0.4145   |
+| 0.2762        | 44.0    | 15026 | 6.1128          | 0.4126   |
+| 0.2633        | 44.9985 | 15367 | 6.1446          | 0.4127   |
+| 0.2508        | 46.0    | 15709 | 6.2330          | 0.4134   |
+| 0.2397        | 46.9985 | 16050 | 6.2369          | 0.4125   |
+| 0.2259        | 48.0    | 16392 | 6.2775          | 0.4142   |
+| 0.2228        | 48.9985 | 16733 | 6.2132          | 0.4128   |
+| 0.2098        | 49.9268 | 17050 | 6.3965          | 0.4106   |
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.41.1
+- Pytorch 2.1.0+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
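
The batch-size totals in the README's hyperparameter list can be checked from the per-device values. The sketch below is not part of the commit. It assumes the usual convention that the effective training batch is the per-device batch size times the number of devices times the gradient accumulation steps, and that evaluation uses no accumulation. The helper name `effective_batch_size` is hypothetical.

```python
# Values copied from the "Training hyperparameters" list in the README.
train_batch_size = 1             # per-device training batch size
eval_batch_size = 2              # per-device evaluation batch size
num_devices = 4
gradient_accumulation_steps = 8


def effective_batch_size(per_device, devices, accumulation_steps=1):
    """Hypothetical helper: samples consumed per optimizer (or eval) step."""
    return per_device * devices * accumulation_steps


# Assumption: training accumulates gradients, evaluation does not.
total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
total_eval_batch_size = effective_batch_size(eval_batch_size, num_devices)

print(total_train_batch_size, total_eval_batch_size)
```

Under these assumptions the computed values match the `total_train_batch_size: 32` and `total_eval_batch_size: 8` entries reported in the card.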