End of training
Files changed:
- README.md (+14 -14)
- adapter_config.json (+2 -2)
- adapter_model.safetensors (+1 -1)
- tokenizer.json (+1 -1)
- training_args.bin (+2 -2)
README.md

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.5958
 
 ## Model description
 
@@ -42,7 +42,7 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type:
+- lr_scheduler_type: reduce_lr_on_plateau
 - lr_scheduler_warmup_steps: 200
 - num_epochs: 4
 
@@ -50,18 +50,18 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.
-|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.9184        | 0.33  | 9    | 0.9121          |
+| 0.8543        | 0.67  | 18   | 0.7775          |
+| 0.712         | 1.0   | 27   | 0.6491          |
+| 0.6182        | 1.33  | 36   | 0.6139          |
+| 0.5267        | 1.67  | 45   | 0.5951          |
+| 0.5141        | 2.0   | 54   | 0.5840          |
+| 0.454         | 2.33  | 63   | 0.5961          |
+| 0.4618        | 2.67  | 72   | 0.5944          |
+| 0.4989        | 3.0   | 81   | 0.5944          |
+| 0.3703        | 3.33  | 90   | 0.6079          |
+| 0.4162        | 3.67  | 99   | 0.5984          |
+| 0.4135        | 4.0   | 108  | 0.5958          |
 
 
 ### Framework versions
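A minimal sketch of how the updated hyperparameters might be reproduced with transformers.TrainingArguments; the output_dir is a placeholder, and anything not listed in the hunks above is an assumption:

```python
from transformers import TrainingArguments

# Sketch only: values mirror the README hunks above; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="openhermes-lora",              # hypothetical
    per_device_eval_batch_size=8,              # eval_batch_size: 8
    seed=42,                                   # seed: 42
    adam_beta1=0.9,                            # Adam betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                         # epsilon=1e-08
    lr_scheduler_type="reduce_lr_on_plateau",  # the updated scheduler value
    warmup_steps=200,                          # lr_scheduler_warmup_steps: 200
    num_train_epochs=4,                        # num_epochs: 4
)
```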
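Assuming the reported losses are mean token-level cross-entropy, the final validation loss of 0.5958 corresponds to a perplexity of about exp(0.5958) ≈ 1.81:

```python
import math

eval_loss = 0.5958  # final validation loss in the table above
print(f"perplexity ≈ {math.exp(eval_loss):.3f}")  # ≈ 1.814
```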
adapter_config.json

@@ -19,10 +19,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "q_proj",
     "k_proj",
-    "v_proj",
     "o_proj",
-    "
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
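After this edit, target_modules covers the four attention projections q_proj, k_proj, o_proj, and v_proj. A sketch of an equivalent peft LoraConfig; r and lora_alpha are assumptions, since this hunk does not show them:

```python
from peft import LoraConfig

# Sketch: only target_modules and task_type come from the diff above.
lora_config = LoraConfig(
    r=8,            # assumption, not shown in this hunk
    lora_alpha=16,  # assumption, not shown in this hunk
    target_modules=["q_proj", "k_proj", "o_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```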
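To apply this adapter at inference time, the usual pattern is to load the base model and wrap it with the adapter weights; a sketch, with the adapter repository id as a placeholder:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "teknium/OpenHermes-2.5-Mistral-7B"
base = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# "user/adapter-repo" is a placeholder for this repository's actual id.
model = PeftModel.from_pretrained(base, "user/adapter-repo")
```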
adapter_model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:85184fa9f94b7b7dc26fe233479247c3838e968bc3cb7d6018068aceba964cc4
 size 13665592
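The binary files here are Git LFS pointers: the tracked text stores only the blob's SHA-256 (oid) and its size, so a downloaded copy can be checked against the hash. A sketch, assuming the file sits in the current directory:

```python
import hashlib

def sha256_of(path: str) -> str:
    # Stream in 1 MiB chunks to avoid reading the ~13 MB file at once.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "85184fa9f94b7b7dc26fe233479247c3838e968bc3cb7d6018068aceba964cc4"
assert sha256_of("adapter_model.safetensors") == expected
```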
tokenizer.json

@@ -2,7 +2,7 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length":
+    "max_length": 3800,
     "strategy": "LongestFirst",
     "stride": 0
   },
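This change enables right-side truncation at 3800 tokens with the longest-first strategy and no stride. With the tokenizers library, the same settings can be applied at runtime; the local file path is an assumption:

```python
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")  # hypothetical local path
tok.enable_truncation(
    max_length=3800,           # "max_length": 3800
    stride=0,                  # "stride": 0
    strategy="longest_first",  # "strategy": "LongestFirst"
    direction="right",         # "direction": "Right"
)
```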
training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:91b70e9eb938f894f98dacb706ece473da02042e963f4972526e8a1d0dbd1d25
+size 4792
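training_args.bin holds a torch-pickled TrainingArguments object rather than model weights, so it can be inspected directly; a sketch, noting that recent torch versions need weights_only=False to unpickle non-tensor objects:

```python
import torch

# Hypothetical local path; requires a compatible transformers version installed.
args = torch.load("training_args.bin", weights_only=False)
print(args.lr_scheduler_type, args.warmup_steps, args.num_train_epochs)
```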