shawgpt-ft-rank15

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4008
+- Loss: 1.5116
 ## Model description
@@ -44,23 +44,25 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 10
+- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.1392        | 0.9231 | 3    | 1.3117          |
-| 1.0798        | 1.8462 | 6    | 1.3205          |
-| 1.0056        | 2.7692 | 9    | 1.3275          |
-| 0.7115        | 4.0    | 13   | 1.3498          |
-| 0.9418        | 4.9231 | 16   | 1.3680          |
-| 0.8846        | 5.8462 | 19   | 1.3849          |
-| 0.8429        | 6.7692 | 22   | 1.3786          |
-| 0.6394        | 8.0    | 26   | 1.4015          |
-| 0.839         | 8.9231 | 29   | 1.4010          |
-| 0.6082        | 9.2308 | 30   | 1.4008          |
+| Training Loss | Epoch   | Step | Validation Loss |
+|:-------------:|:-------:|:----:|:---------------:|
+| 1.0283        | 0.9231  | 3    | 1.3097          |
+| 0.9868        | 1.8462  | 6    | 1.3728          |
+| 0.9174        | 2.7692  | 9    | 1.3527          |
+| 0.6429        | 4.0     | 13   | 1.4026          |
+| 0.8442        | 4.9231  | 16   | 1.4004          |
+| 0.7766        | 5.8462  | 19   | 1.4248          |
+| 0.7289        | 6.7692  | 22   | 1.4474          |
+| 0.543         | 8.0     | 26   | 1.4707          |
+| 0.6976        | 8.9231  | 29   | 1.4967          |
+| 0.6682        | 9.8462  | 32   | 1.5151          |
+| 0.6726        | 10.7692 | 35   | 1.5118          |
+| 0.1014        | 11.0769 | 36   | 1.5116          |
 ### Framework versions
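
Reviewer note (not part of the commit): the updated results table can be checked with a few lines of Python. The sketch below is a minimal illustration that assumes the rows are copied by hand from the new table; the names `ROWS`, `best_row`, and `steps_per_epoch` are hypothetical helpers, not anything from the training code. It finds the evaluation step with the lowest validation loss and recovers the steps-per-epoch ratio implied by the Epoch and Step columns.

```python
# Rows copied from the updated "Training results" table in this diff:
# (training_loss, epoch, step, validation_loss)
ROWS = [
    (1.0283, 0.9231, 3, 1.3097),
    (0.9868, 1.8462, 6, 1.3728),
    (0.9174, 2.7692, 9, 1.3527),
    (0.6429, 4.0, 13, 1.4026),
    (0.8442, 4.9231, 16, 1.4004),
    (0.7766, 5.8462, 19, 1.4248),
    (0.7289, 6.7692, 22, 1.4474),
    (0.543, 8.0, 26, 1.4707),
    (0.6976, 8.9231, 29, 1.4967),
    (0.6682, 9.8462, 32, 1.5151),
    (0.6726, 10.7692, 35, 1.5118),
    (0.1014, 11.0769, 36, 1.5116),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def steps_per_epoch(rows):
    """Return step/epoch for each row, rounded to two decimals."""
    return [round(step / epoch, 2) for _, epoch, step, _ in rows]


best = best_row(ROWS)
final = ROWS[-1]

print("lowest validation loss:", best[3], "at step", best[2])
print("final validation loss:", final[3], "at step", final[2])
print("steps per epoch (per row):", sorted(set(steps_per_epoch(ROWS))))
```

On these rows the lowest validation loss is at the first logged evaluation (step 3), and every row implies the same ratio of 3.25 optimizer steps per epoch, which is why whole-number epochs land on steps 13 and 26.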

runs/Oct22_09-18-40_99867d27916d/events.out.tfevents.1729588721.99867d27916d.882.21 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:a6ad582d499d90bb028ec4db5a7fc11c410df478ecf7b097d4729519d5ab2eb2
+size 11627

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a55f68d288cfb970f0820ce3e2e26fdf6a5f0779e3f2b13c40d535bd3ee541bf
+oid sha256:fdf0306eb9bfdf4a008770af7e841234ee4c6f463c0686303c05e7e1f62aaebe
 size 5176