irenewds/shawgpt-ft

Files changed (3)

README.md +16 -21
runs/Oct24_19-16-48_1f4e1c060daf/events.out.tfevents.1729797412.1f4e1c060daf.3103.5 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5273
+- Loss: 1.4675
 ## Model description
@@ -44,32 +44,27 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 20
+- num_epochs: 15
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6269        | 0.9231  | 3    | 4.1312          |
-| 4.3723        | 1.8462  | 6    | 3.8586          |
-| 4.028         | 2.7692  | 9    | 3.5724          |
-| 2.7602        | 4.0     | 13   | 3.1993          |
-| 3.3802        | 4.9231  | 16   | 2.9506          |
-| 3.0875        | 5.8462  | 19   | 2.7413          |
-| 2.8505        | 6.7692  | 22   | 2.5576          |
-| 1.9712        | 8.0     | 26   | 2.3313          |
-| 2.423         | 8.9231  | 29   | 2.1863          |
-| 2.2168        | 9.8462  | 32   | 2.0370          |
-| 2.0654        | 10.7692 | 35   | 1.9127          |
-| 1.4253        | 12.0    | 39   | 1.7988          |
-| 1.8246        | 12.9231 | 42   | 1.7284          |
-| 1.7187        | 13.8462 | 45   | 1.6691          |
-| 1.6675        | 14.7692 | 48   | 1.6176          |
-| 1.2097        | 16.0    | 52   | 1.5665          |
-| 1.5569        | 16.9231 | 55   | 1.5437          |
-| 1.5205        | 17.8462 | 58   | 1.5306          |
-| 1.0627        | 18.4615 | 60   | 1.5273          |
+| 4.6055        | 0.9231  | 3    | 4.0311          |
+| 4.1731        | 1.8462  | 6    | 3.6064          |
+| 3.6829        | 2.7692  | 9    | 3.2058          |
+| 2.4342        | 4.0     | 13   | 2.7664          |
+| 2.8888        | 4.9231  | 16   | 2.4957          |
+| 2.5378        | 5.8462  | 19   | 2.2613          |
+| 2.2353        | 6.7692  | 22   | 2.0239          |
+| 1.4921        | 8.0     | 26   | 1.8148          |
+| 1.8263        | 8.9231  | 29   | 1.7127          |
+| 1.6847        | 9.8462  | 32   | 1.6285          |
+| 1.6188        | 10.7692 | 35   | 1.5605          |
+| 1.13          | 12.0    | 39   | 1.4999          |
+| 1.488         | 12.9231 | 42   | 1.4764          |
+| 1.2276        | 13.8462 | 45   | 1.4675          |
 ### Framework versions
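
Reviewer note: the fractional Epoch values in the training results are consistent with 3.25 optimizer steps per epoch, since step 13 is logged at epoch 4.0. That ratio is inferred from the table, not stated anywhere in the model card, so the following is only a minimal sketch for sanity-checking the Epoch column; `STEPS_PER_EPOCH` and `epoch_at_step` are hypothetical names.

```python
# Assumption: 13 optimizer steps span exactly 4 epochs, as the table rows
# with Epoch 4.0 at Step 13 suggest. The model card does not state this.
STEPS_PER_EPOCH = 13 / 4  # 3.25


def epoch_at_step(step: int) -> float:
    """Return the fractional epoch after `step` optimizer steps, to 4 decimals."""
    return round(step / STEPS_PER_EPOCH, 4)


# (step, epoch) pairs copied from the training results tables in this diff.
logged = [(3, 0.9231), (6, 1.8462), (9, 2.7692), (13, 4.0),
          (45, 13.8462), (60, 18.4615)]

for step, epoch in logged:
    assert epoch_at_step(step) == epoch, (step, epoch)
```

In both runs the last logged step equals 3 times `num_epochs` (60 for 20 epochs, 45 for 15), which is why the final Epoch value stops short of the configured epoch count.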

runs/Oct24_19-16-48_1f4e1c060daf/events.out.tfevents.1729797412.1f4e1c060daf.3103.5 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0d7e59b9d5fdc1ae8a9ff50317d2317ac1eef9bab83f4352086fb3ac2a693cd
+size 12553
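
Reviewer note: the added file is a Git LFS pointer, not the TensorBoard event data itself. Each pointer line is a key followed by a space and a value. A minimal sketch of reading those fields, assuming the three-line layout shown in this diff; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


# Pointer content copied from the diff above (leading "+" markers removed).
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:f0d7e59b9d5fdc1ae8a9ff50317d2317ac1eef9bab83f4352086fb3ac2a693cd
size 12553
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The same reading applies to `training_args.bin`: only its `oid` changes while `size` stays at 5176, so the stored object was replaced by one of identical length.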

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:71fdfda02d0cfb458214911c060fb18b19dbe76ad69dcf1ca60828ea74ab0330
+oid sha256:023476b94d79af97ff9f39f01d7facdda61b8aaaaff1a07986c51bf0f5a941a0
 size 5176