shawgpt-ft-lr2e-05-wd0.01

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8444
+- Loss: 3.9690
 ## Model description
@@ -51,18 +51,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6427        | 0.9231  | 3    | 4.2081          |
-| 4.5689        | 1.8462  | 6    | 4.1561          |
-| 4.5192        | 2.7692  | 9    | 4.1061          |
-| 3.3371        | 4.0     | 13   | 4.0411          |
-| 4.4042        | 4.9231  | 16   | 3.9947          |
-| 4.328         | 5.8462  | 19   | 3.9542          |
-| 4.2913        | 6.7692  | 22   | 3.9199          |
-| 3.168         | 8.0     | 26   | 3.8843          |
-| 4.199         | 8.9231  | 29   | 3.8648          |
-| 4.1898        | 9.8462  | 32   | 3.8517          |
-| 4.1757        | 10.7692 | 35   | 3.8450          |
-| 0.9827        | 11.0769 | 36   | 3.8444          |
+| 4.6447        | 0.9231  | 3    | 4.2183          |
+| 4.5908        | 1.8462  | 6    | 4.1843          |
+| 4.5642        | 2.7692  | 9    | 4.1518          |
+| 3.3876        | 4.0     | 13   | 4.1100          |
+| 4.4932        | 4.9231  | 16   | 4.0795          |
+| 4.4358        | 5.8462  | 19   | 4.0517          |
+| 4.4138        | 6.7692  | 22   | 4.0268          |
+| 3.2645        | 8.0     | 26   | 3.9996          |
+| 4.335         | 8.9231  | 29   | 3.9846          |
+| 4.3323        | 9.8462  | 32   | 3.9746          |
+| 4.3194        | 10.7692 | 35   | 3.9696          |
+| 1.0224        | 11.0769 | 36   | 3.9690          |
 ### Framework versions
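
Reviewer note: the README change raises the reported evaluation loss from 3.8444 to 3.9690. The sketch below quantifies that change. It is only an illustration: the variable names are hypothetical, and the perplexity conversion assumes the reported loss is a mean token-level cross-entropy in nats, which the README does not state.

```python
import math

# Final validation losses copied from the two versions of the README table.
old_final_loss = 3.8444  # removed side of the diff
new_final_loss = 3.9690  # added side of the diff


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity.

    Assumption: the loss is a mean cross-entropy measured in nats.
    """
    return math.exp(loss)


delta = new_final_loss - old_final_loss
relative = delta / old_final_loss

print(f"absolute change: {delta:+.4f}")
print(f"relative change: {relative:+.2%}")
print(
    "perplexity (under the stated assumption): "
    f"{perplexity(old_final_loss):.1f} -> {perplexity(new_final_loss):.1f}"
)
```

A positive change means the run recorded in this commit ends with a higher validation loss than the run it replaces.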

runs/Oct22_11-32-59_99867d27916d/events.out.tfevents.1729596779.99867d27916d.882.60 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4e2fb7ede4754e84e4346085541cc4f19d34b49018fa3dabc108d9d55bd48d3e
+size 11650

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26d7e6afbc156d3f61cce15c18c5e41fd91544c8caa094bf7e0c4ec3e22248c5
+oid sha256:61a535b8886a8be8bb6bdf938ffb858c39c1a8ecf22dc5470de16370ba0682ca
 size 5240
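
Reviewer note: `training_args.bin` is stored as a Git LFS pointer, so the diff only shows the `oid` changing while `size` stays at 5240 bytes. The sketch below parses a pointer in the three-line form shown here and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer made of 'key value' lines (version, oid, size)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    data = Path(path).read_bytes()
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer text for the new version of training_args.bin, copied from the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:61a535b8886a8be8bb6bdf938ffb858c39c1a8ecf22dc5470de16370ba0682ca\n"
    "size 5240\n"
)
print(pointer["algo"], pointer["size"])
```

To check a local copy, call `matches_pointer("training_args.bin", pointer)` after downloading the real file; the path is a placeholder for wherever the file is saved.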