shawgpt-ft-rank15

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4906
 ## Model description
@@ -44,25 +44,27 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.5923        | 0.9231  | 3    | 3.9658          |
-| 4.0441        | 1.8462  | 6    | 3.4365          |
-| 3.451         | 2.7692  | 9    | 2.9600          |
-| 2.2208        | 4.0     | 13   | 2.4960          |
-| 2.5836        | 4.9231  | 16   | 2.2153          |
-| 2.2284        | 5.8462  | 19   | 1.9993          |
-| 1.9357        | 6.7692  | 22   | 1.7904          |
-| 1.3278        | 8.0     | 26   | 1.6449          |
-| 1.6406        | 8.9231  | 29   | 1.5613          |
-| 1.5307        | 9.8462  | 32   | 1.5156          |
-| 1.5151        | 10.7692 | 35   | 1.4931          |
-| 0.2742        | 11.0769 | 36   | 1.4906          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.3663
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- num_epochs: 15
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 4.5909        | 0.9231  | 3    | 3.9607          |
+| 4.0305        | 1.8462  | 6    | 3.4190          |
+| 3.4227        | 2.7692  | 9    | 2.9340          |
+| 2.2008        | 4.0     | 13   | 2.4790          |
+| 2.5483        | 4.9231  | 16   | 2.1770          |
+| 2.1551        | 5.8462  | 19   | 1.9264          |
+| 1.8321        | 6.7692  | 22   | 1.7147          |
+| 1.2386        | 8.0     | 26   | 1.5355          |
+| 1.4952        | 8.9231  | 29   | 1.4535          |
+| 1.3923        | 9.8462  | 32   | 1.4129          |
+| 1.3843        | 10.7692 | 35   | 1.3921          |
+| 0.9917        | 12.0    | 39   | 1.3751          |
+| 1.3332        | 12.9231 | 42   | 1.3687          |
+| 1.1067        | 13.8462 | 45   | 1.3663          |
 ### Framework versions
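
To put the change in this README diff into numbers, the sketch below parses rows copied from the old and new "Training results" tables and compares the final validation loss of each run. The helper name `parse_results` is hypothetical and only a subset of rows is included for brevity; this is an illustration of the arithmetic, not part of the training code.

```python
def parse_results(table: str) -> dict:
    """Parse '| train loss | epoch | step | val loss |' rows into {step: val_loss}."""
    results = {}
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        try:
            step = int(cells[2])
            val_loss = float(cells[3])
        except ValueError:
            continue  # header or separator row, not data
        results[step] = val_loss
    return results


# Rows copied from the removed (-) side of the diff: num_epochs 12.
old_run = """
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.5151        | 10.7692 | 35   | 1.4931          |
| 0.2742        | 11.0769 | 36   | 1.4906          |
"""

# Rows copied from the added (+) side of the diff: num_epochs 15.
new_run = """
| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.3843        | 10.7692 | 35   | 1.3921          |
| 1.1067        | 13.8462 | 45   | 1.3663          |
"""

old = parse_results(old_run)
new = parse_results(new_run)

# The final row of each table is the one with the highest step.
old_final = old[max(old)]
new_final = new[max(new)]

abs_drop = old_final - new_final
rel_drop_pct = 100 * abs_drop / old_final

print(f"final validation loss: {old_final} -> {new_final}")
print(f"absolute drop: {abs_drop:.4f}, relative drop: {rel_drop_pct:.1f}%")
```

The same dictionaries also allow a like-for-like comparison at a shared step, for example `old[35]` against `new[35]`, which separates the effect of the new run's settings from the effect of simply training for more steps.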

runs/Oct22_10-55-18_99867d27916d/events.out.tfevents.1729594518.99867d27916d.882.47 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:a5e280f766d647eb9ff21043a19a9b648c8c3b97c26c95b2cb0b7037368365b7
+size 12573
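
The three added lines are a Git LFS pointer: the repository stores this small text stub while the actual TensorBoard event file lives in LFS storage. A minimal sketch of reading such a pointer follows, assuming the simple "key value" line layout shown above; the function name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the diff, without the leading '+' markers.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a5e280f766d647eb9ff21043a19a9b648c8c3b97c26c95b2cb0b7037368365b7
size 12573
"""

info = parse_lfs_pointer(pointer)
print(info["algorithm"], info["size"])
```

Because the pointer records only a digest and a size, a changed `oid` with an unchanged `size`, as in the `training_args.bin` diff, means the file contents differ even though the byte count is the same.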

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fe2bd0787ce2d2c3bae22e838f391bf15e085d1ecb681083acb22ebedbc2fb38
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:997157d142991aa78774e8182a1da5eaaafed179f219a2d23230373db795e035
 size 5176