irenewds/shawgpt-ft


Files changed (4)

README.md +21 -21
runs/Oct24_19-05-18_1f4e1c060daf/events.out.tfevents.1729796719.1f4e1c060daf.3103.2 +3 -0
runs/Oct24_19-05-33_1f4e1c060daf/events.out.tfevents.1729796735.1f4e1c060daf.3103.3 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4951
 ## Model description
@@ -35,7 +35,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.00015
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
@@ -51,25 +51,25 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.626         | 0.9231  | 3    | 4.1261          |
-| 4.3644        | 1.8462  | 6    | 3.8475          |
-| 4.0214        | 2.7692  | 9    | 3.5611          |
-| 2.7572        | 4.0     | 13   | 3.1852          |
-| 3.3746        | 4.9231  | 16   | 2.9326          |
-| 3.0744        | 5.8462  | 19   | 2.7162          |
-| 2.8321        | 6.7692  | 22   | 2.5310          |
-| 1.9575        | 8.0     | 26   | 2.3037          |
-| 2.3937        | 8.9231  | 29   | 2.1512          |
-| 2.1841        | 9.8462  | 32   | 2.0072          |
-| 2.0372        | 10.7692 | 35   | 1.8823          |
-| 1.4009        | 12.0    | 39   | 1.7594          |
-| 1.7863        | 12.9231 | 42   | 1.6881          |
-| 1.6787        | 13.8462 | 45   | 1.6275          |
-| 1.6265        | 14.7692 | 48   | 1.5777          |
-| 1.1833        | 16.0    | 52   | 1.5324          |
-| 1.5237        | 16.9231 | 55   | 1.5103          |
-| 1.4914        | 17.8462 | 58   | 1.4982          |
-| 1.0442        | 18.4615 | 60   | 1.4951          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.3735
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0002
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 4.6194        | 0.9231  | 3    | 4.0943          |
+| 4.2766        | 1.8462  | 6    | 3.7269          |
+| 3.8243        | 2.7692  | 9    | 3.3448          |
+| 2.5415        | 4.0     | 13   | 2.8882          |
+| 3.0323        | 4.9231  | 16   | 2.6203          |
+| 2.6876        | 5.8462  | 19   | 2.3736          |
+| 2.3879        | 6.7692  | 22   | 2.1420          |
+| 1.5754        | 8.0     | 26   | 1.8716          |
+| 1.891         | 8.9231  | 29   | 1.7453          |
+| 1.7175        | 9.8462  | 32   | 1.6317          |
+| 1.6072        | 10.7692 | 35   | 1.5333          |
+| 1.0984        | 12.0    | 39   | 1.4520          |
+| 1.4336        | 12.9231 | 42   | 1.4211          |
+| 1.3754        | 13.8462 | 45   | 1.4011          |
+| 1.3729        | 14.7692 | 48   | 1.3903          |
+| 1.0266        | 16.0    | 52   | 1.3809          |
+| 1.3402        | 16.9231 | 55   | 1.3765          |
+| 1.3242        | 17.8462 | 58   | 1.3741          |
+| 0.9329        | 18.4615 | 60   | 1.3735          |
 ### Framework versions
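
To make the README change easier to read, here is a minimal sketch that compares the validation losses of the two runs, using only the numbers shown in the tables above. The variable names are illustrative. The diff shows the learning rate moving from 0.00015 to 0.0002, but hyperparameters outside the displayed hunks may also differ, so the comparison should not be read as proof that the learning rate alone explains the improvement. The steps-per-epoch value of 3.25 is inferred from the table (step 13 is logged at epoch 4.0), not stated in the source.

```python
# (step, validation loss) pairs copied from the removed and added tables.
old_run = [(3, 4.1261), (6, 3.8475), (9, 3.5611), (13, 3.1852), (16, 2.9326),
           (19, 2.7162), (22, 2.5310), (26, 2.3037), (29, 2.1512), (32, 2.0072),
           (35, 1.8823), (39, 1.7594), (42, 1.6881), (45, 1.6275), (48, 1.5777),
           (52, 1.5324), (55, 1.5103), (58, 1.4982), (60, 1.4951)]
new_run = [(3, 4.0943), (6, 3.7269), (9, 3.3448), (13, 2.8882), (16, 2.6203),
           (19, 2.3736), (22, 2.1420), (26, 1.8716), (29, 1.7453), (32, 1.6317),
           (35, 1.5333), (39, 1.4520), (42, 1.4211), (45, 1.4011), (48, 1.3903),
           (52, 1.3809), (55, 1.3765), (58, 1.3741), (60, 1.3735)]

# Final evaluation loss reported at the top of each README version.
final_old = old_run[-1][1]
final_new = new_run[-1][1]
abs_drop = final_old - final_new
rel_drop = abs_drop / final_old

# Check whether the new run is lower at every logged step, not only at the end.
new_better_everywhere = all(n < o for (_, o), (_, n) in zip(old_run, new_run))

# Inferred from the table: step 13 corresponds to epoch 4.0.
steps_per_epoch = 13 / 4.0
final_epoch = round(60 / steps_per_epoch, 4)

print(f"final validation loss: {final_old} -> {final_new}")
print(f"absolute drop: {abs_drop:.4f}, relative drop: {rel_drop:.1%}")
print(f"new run lower at every logged step: {new_better_everywhere}")
print(f"steps per epoch: {steps_per_epoch}, final epoch: {final_epoch}")
```

The fractional steps-per-epoch value is consistent with the non-integer epochs in the Epoch column; what produces it (for example, gradient accumulation over a small dataset) is not visible in this diff.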

runs/Oct24_19-05-18_1f4e1c060daf/events.out.tfevents.1729796719.1f4e1c060daf.3103.2 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:00d6bcf330beb536171cf033817898b7044b291397206223d6518701f23e1ad0
+size 5582

runs/Oct24_19-05-33_1f4e1c060daf/events.out.tfevents.1729796735.1f4e1c060daf.3103.3 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9aaa4c130704c60c63e6721628e51acf3c5125c1e713a7c056d3b184107bedc6
+size 14917

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a0e8fcdc09c1c3add17a72743e05eae0a887cc42c5df7f15601730c27da7a629
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:7f72189ff66b7b37459efba8f7a033a1f20cad4e6281fcd855e614db15f527ca
 size 5176
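
The binary files in this commit appear only as Git LFS pointer files, each holding a `version`, an `oid`, and a `size` line. As a rough sketch, the pointers for `training_args.bin` can be parsed and compared as follows. The helper name `parse_lfs_pointer` is hypothetical and handles only the three fields shown in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash method>:<hex digest>".
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the training_args.bin hunk above.
old = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:a0e8fcdc09c1c3add17a72743e05eae0a887cc42c5df7f15601730c27da7a629
size 5176
""")
new = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:7f72189ff66b7b37459efba8f7a033a1f20cad4e6281fcd855e614db15f527ca
size 5176
""")

content_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]
print(f"content changed: {content_changed}, same byte size: {same_size}")
```

A different digest with an identical size means the file contents changed without changing its length. That would fit a single serialized value being updated, such as the learning rate, although the pointer alone cannot confirm what changed.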