tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2

tyzhu committed on Jun 8, 2024

Commit a46e1a5 (verified) · 1 Parent(s): f5e2951

Model save

Files changed (1)

README.md +117 -0

README.md ADDED

	@@ -0,0 +1,117 @@

+---
+license: other
+base_model: Qwen/Qwen1.5-4B
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2
+  results: []
+library_name: peft
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2
+This model is a fine-tuned version of [Qwen/Qwen1.5-4B](https://huggingface.co/Qwen/Qwen1.5-4B) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.6284
+- Accuracy: 0.38
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 1
+- eval_batch_size: 2
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 32
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_ratio: 0.05
+- num_epochs: 50.0
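
Note on the hyperparameters above: the two `total_*` values follow from the per-device settings. The sketch below only redoes that arithmetic with the numbers copied from the list; the variable names mirror the list entries and are not taken from any training script.

```python
# Values copied from the hyperparameter list in the model card.
train_batch_size = 1             # per-device training batch size
eval_batch_size = 2              # per-device evaluation batch size
num_devices = 4                  # GPUs used (distributed_type: multi-GPU)
gradient_accumulation_steps = 8  # micro-batches accumulated per optimizer step

# Sequences contributing to one optimizer step across all devices.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# Evaluation does not accumulate gradients, so only the device count matters.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 32, matching total_train_batch_size
print(total_eval_batch_size)   # 8, matching total_eval_batch_size
```

One further observation, offered with some caution: in Transformers the `constant` schedule is distinct from `constant_with_warmup`, so the listed `lr_scheduler_warmup_ratio: 0.05` likely had no effect on this run.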
+### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
+|:-------------:|:-------:|:-----:|:---------------:|:--------:|
+| 1.9636        | 0.9985  | 341   | 4.3673          | 0.4142   |
+| 1.7006        | 2.0     | 683   | 4.5869          | 0.4334   |
+| 1.3696        | 2.9985  | 1024  | 4.7205          | 0.4254   |
+| 1.0297        | 4.0     | 1366  | 4.7833          | 0.4172   |
+| 0.7991        | 4.9985  | 1707  | 4.9334          | 0.4169   |
+| 0.5801        | 6.0     | 2049  | 5.1964          | 0.4134   |
+| 0.4228        | 6.9985  | 2390  | 5.4040          | 0.4101   |
+| 0.3691        | 8.0     | 2732  | 5.6553          | 0.4098   |
+| 0.3052        | 8.9985  | 3073  | 5.5593          | 0.4122   |
+| 0.2993        | 10.0    | 3415  | 5.6595          | 0.4115   |
+| 0.2639        | 10.9985 | 3756  | 5.8112          | 0.4053   |
+| 0.2447        | 12.0    | 4098  | 5.8116          | 0.4063   |
+| 0.257         | 12.9985 | 4439  | 5.7970          | 0.4067   |
+| 0.2346        | 14.0    | 4781  | 5.7984          | 0.4010   |
+| 0.2477        | 14.9985 | 5122  | 5.9483          | 0.4007   |
+| 0.2341        | 16.0    | 5464  | 6.0840          | 0.3997   |
+| 0.2471        | 16.9985 | 5805  | 6.0255          | 0.3976   |
+| 0.2303        | 18.0    | 6147  | 5.9475          | 0.4012   |
+| 0.2165        | 18.9985 | 6488  | 6.3113          | 0.396    |
+| 0.2293        | 20.0    | 6830  | 6.1628          | 0.3935   |
+| 0.2193        | 20.9985 | 7171  | 6.3133          | 0.3883   |
+| 0.2313        | 22.0    | 7513  | 6.1371          | 0.3915   |
+| 0.217         | 22.9985 | 7854  | 6.0323          | 0.3932   |
+| 0.205         | 24.0    | 8196  | 6.2038          | 0.3892   |
+| 0.2208        | 24.9985 | 8537  | 6.0502          | 0.3894   |
+| 0.2102        | 26.0    | 8879  | 6.1540          | 0.3836   |
+| 0.2217        | 26.9985 | 9220  | 5.9979          | 0.388    |
+| 0.209         | 28.0    | 9562  | 6.2838          | 0.3872   |
+| 0.2206        | 28.9985 | 9903  | 6.1295          | 0.3867   |
+| 0.2109        | 30.0    | 10245 | 6.2467          | 0.3889   |
+| 0.1988        | 30.9985 | 10586 | 6.2880          | 0.3874   |
+| 0.2167        | 32.0    | 10928 | 6.2385          | 0.386    |
+| 0.2045        | 32.9985 | 11269 | 6.4127          | 0.3863   |
+| 0.2146        | 34.0    | 11611 | 6.3402          | 0.3849   |
+| 0.2049        | 34.9985 | 11952 | 6.3543          | 0.3872   |
+| 0.1954        | 36.0    | 12294 | 6.4192          | 0.3846   |
+| 0.2078        | 36.9985 | 12635 | 6.3592          | 0.3874   |
+| 0.1977        | 38.0    | 12977 | 6.5489          | 0.3876   |
+| 0.2094        | 38.9985 | 13318 | 6.3914          | 0.3903   |
+| 0.2012        | 40.0    | 13660 | 6.4228          | 0.3889   |
+| 0.2106        | 40.9985 | 14001 | 6.4559          | 0.3871   |
+| 0.2015        | 42.0    | 14343 | 6.3730          | 0.3823   |
+| 0.1921        | 42.9985 | 14684 | 6.3121          | 0.3826   |
+| 0.2019        | 44.0    | 15026 | 6.3081          | 0.3827   |
+| 0.1953        | 44.9985 | 15367 | 6.4581          | 0.3827   |
+| 0.2077        | 46.0    | 15709 | 6.6189          | 0.3801   |
+| 0.1997        | 46.9985 | 16050 | 6.4585          | 0.3835   |
+| 0.1899        | 48.0    | 16392 | 6.6852          | 0.3792   |
+| 0.1984        | 48.9985 | 16733 | 6.6309          | 0.3828   |
+| 0.1895        | 49.9268 | 17050 | 6.6284          | 0.38     |
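
Note on the table above: training loss falls steadily while validation loss rises and accuracy drifts down, so the final checkpoint is not the strongest one by either validation metric. A minimal sketch of picking the best row is shown below, assuming the metrics are available as plain tuples. Only a handful of rows are copied from the table for brevity, and the `EvalRow` type is a hypothetical helper, not part of the Trainer output.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One row of the results table (hypothetical container for illustration)."""
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# A subset of rows copied verbatim from the table: the first four
# evaluations and the final one.
rows = [
    EvalRow(0.9985, 341, 4.3673, 0.4142),
    EvalRow(2.0, 683, 4.5869, 0.4334),
    EvalRow(2.9985, 1024, 4.7205, 0.4254),
    EvalRow(4.0, 1366, 4.7833, 0.4172),
    EvalRow(49.9268, 17050, 6.6284, 0.38),
]

# Highest validation accuracy and lowest validation loss can disagree,
# so report both rather than assuming one criterion.
best_by_accuracy = max(rows, key=lambda r: r.accuracy)
best_by_loss = min(rows, key=lambda r: r.val_loss)

print(best_by_accuracy.step, best_by_accuracy.accuracy)  # 683 0.4334
print(best_by_loss.step, best_by_loss.val_loss)          # 341 4.3673
```

The same two rows come out on top when the full table is used: accuracy peaks at step 683 and validation loss is lowest at step 341.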
+### Framework versions
+- PEFT 0.5.0
+- Transformers 4.41.1
+- Pytorch 2.1.0+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
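
Since the card lists `library_name: peft` and `base_model: Qwen/Qwen1.5-4B`, loading would presumably mean applying the adapter on top of the base model. The sketch below assumes the adapter weights are published under the repository id shown at the top of this page and that `transformers` and `peft` are installed; `load_adapter` is a hypothetical helper name, and the card itself gives no usage instructions.

```python
BASE_MODEL_ID = "Qwen/Qwen1.5-4B"  # from the card's base_model field
# Assumption: repository id taken from the page header, not from the card body.
ADAPTER_ID = "tyzhu/lmind_nq_train6000_eval6489_v1_docidx_v3_5e-4_lora2"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (downloads weights)."""
    # Imported lazily so that defining this helper does not require the
    # libraries until it is actually called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights stored in the adapter repo.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_adapter()` fetches a 4B-parameter base model, so it needs network access and enough memory to hold the weights.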