shawgpt-ft-lr0.0002-wd0.001

Files changed (4)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.3126
+- Loss: 1.7845
 ## Model description
@@ -51,18 +51,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 3.0381        | 0.9231  | 3    | 2.4992          |
-| 2.3477        | 1.8462  | 6    | 1.9579          |
-| 1.8587        | 2.7692  | 9    | 1.6834          |
-| 1.1848        | 4.0     | 13   | 1.4879          |
-| 1.4757        | 4.9231  | 16   | 1.4086          |
-| 1.3626        | 5.8462  | 19   | 1.3707          |
-| 1.3014        | 6.7692  | 22   | 1.3461          |
-| 0.9799        | 8.0     | 26   | 1.3250          |
-| 1.279         | 8.9231  | 29   | 1.3174          |
-| 1.2422        | 9.8462  | 32   | 1.3141          |
-| 1.2599        | 10.7692 | 35   | 1.3127          |
-| 0.2131        | 11.0769 | 36   | 1.3126          |
+| 4.6121        | 0.9231  | 3    | 4.0587          |
+| 4.2126        | 1.8462  | 6    | 3.6493          |
+| 3.7297        | 2.7692  | 9    | 3.2522          |
+| 2.4733        | 4.0     | 13   | 2.8156          |
+| 2.9598        | 4.9231  | 16   | 2.5689          |
+| 2.6551        | 5.8462  | 19   | 2.3614          |
+| 2.3985        | 6.7692  | 22   | 2.1826          |
+| 1.6623        | 8.0     | 26   | 1.9991          |
+| 2.0403        | 8.9231  | 29   | 1.8808          |
+| 1.9001        | 9.8462  | 32   | 1.8158          |
+| 1.8688        | 10.7692 | 35   | 1.7877          |
+| 0.3703        | 11.0769 | 36   | 1.7845          |
 ### Framework versions
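
The README change replaces one training log with another, so the quickest check is to compare the final validation loss on each side. The following is a minimal sketch, assuming the last three table rows from each side are pasted in as plain strings. The names `OLD_ROWS`, `NEW_ROWS`, and `parse_rows` are illustrative and not part of the repository.

```python
# Last three rows of the removed (old) and added (new) tables, copied from the diff.
OLD_ROWS = """
| 1.2422        | 9.8462  | 32   | 1.3141          |
| 1.2599        | 10.7692 | 35   | 1.3127          |
| 0.2131        | 11.0769 | 36   | 1.3126          |
"""
NEW_ROWS = """
| 1.9001        | 9.8462  | 32   | 1.8158          |
| 1.8688        | 10.7692 | 35   | 1.7877          |
| 0.3703        | 11.0769 | 36   | 1.7845          |
"""


def parse_rows(table):
    """Parse '| train_loss | epoch | step | val_loss |' rows into tuples.

    Lines that do not hold four numeric cells (headers, separators) are skipped.
    """
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        try:
            row = (float(cells[0]), float(cells[1]), int(cells[2]), float(cells[3]))
        except (ValueError, IndexError):
            continue
        rows.append(row)
    return rows


old_final = parse_rows(OLD_ROWS)[-1]
new_final = parse_rows(NEW_ROWS)[-1]
delta = new_final[3] - old_final[3]  # positive means the new run ends with higher loss
print(f"step {new_final[2]}: validation loss {old_final[3]} -> {new_final[3]} ({delta:+.4f})")
```

Both runs log at the same steps and epochs, so the rows can be compared position by position. The diff itself does not say which hyperparameters differ between the two runs.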

runs/Oct22_10-26-50_99867d27916d/events.out.tfevents.1729592811.99867d27916d.882.38 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:9b421011bd118628bde1250243cde17f8917071b40b898c246a3a9bd45593eeb
+size 10364
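
The three added lines are a Git LFS pointer: the repository stores this small text stub, while the actual TensorBoard event file lives in LFS storage. As a minimal sketch, the pointer can be read as space-separated key/value lines. The helper name `parse_lfs_pointer` is hypothetical, and the sketch does no validation beyond splitting the fields.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # each line is "<key> <value>"
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")  # oid has the form "sha256:<hex>"
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the diff, without the leading "+" markers.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9b421011bd118628bde1250243cde17f8917071b40b898c246a3a9bd45593eeb
size 10364
"""

info = parse_lfs_pointer(POINTER)
print(info["algo"], info["size"])
```

A changed `oid` with an unchanged `size`, as in the `training_args.bin` entry of this commit, means the file contents differ even though the byte count is the same.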

runs/Oct22_11-37-15_99867d27916d/events.out.tfevents.1729597035.99867d27916d.882.62 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:696d65c92bb9e716a91639f8d0c4c2b53dea1f858f3701b892413ff57075cfe7
+size 11658

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:27e768fc2c94b170a238402db2ffb74fd9ac14e61d43447c0c76e5da983eae3a
+oid sha256:a2f70c555ef2459170c1ee079e591abdbb5147fcaa9a6362402e35d11a28f531
 size 5240