irenewds/shawgpt-ft

Files changed (4)

README.md +16 -16
runs/Oct24_19-20-12_1f4e1c060daf/events.out.tfevents.1729797618.1f4e1c060daf.3103.6 +3 -0
runs/Oct24_19-20-27_1f4e1c060daf/events.out.tfevents.1729797628.1f4e1c060daf.3103.7 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4675
+- Loss: 1.3630
 ## Model description
@@ -35,7 +35,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.00015
+- learning_rate: 0.0002
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
@@ -51,20 +51,20 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6055        | 0.9231  | 3    | 4.0311          |
-| 4.1731        | 1.8462  | 6    | 3.6064          |
-| 3.6829        | 2.7692  | 9    | 3.2058          |
-| 2.4342        | 4.0     | 13   | 2.7664          |
-| 2.8888        | 4.9231  | 16   | 2.4957          |
-| 2.5378        | 5.8462  | 19   | 2.2613          |
-| 2.2353        | 6.7692  | 22   | 2.0239          |
-| 1.4921        | 8.0     | 26   | 1.8148          |
-| 1.8263        | 8.9231  | 29   | 1.7127          |
-| 1.6847        | 9.8462  | 32   | 1.6285          |
-| 1.6188        | 10.7692 | 35   | 1.5605          |
-| 1.13          | 12.0    | 39   | 1.4999          |
-| 1.488         | 12.9231 | 42   | 1.4764          |
-| 1.2276        | 13.8462 | 45   | 1.4675          |
+| 4.5929        | 0.9231  | 3    | 3.9701          |
+| 4.0424        | 1.8462  | 6    | 3.4314          |
+| 3.4326        | 2.7692  | 9    | 2.9386          |
+| 2.1966        | 4.0     | 13   | 2.4495          |
+| 2.5197        | 4.9231  | 16   | 2.1497          |
+| 2.0957        | 5.8462  | 19   | 1.8726          |
+| 1.8112        | 6.7692  | 22   | 1.7120          |
+| 1.2405        | 8.0     | 26   | 1.5321          |
+| 1.4946        | 8.9231  | 29   | 1.4462          |
+| 1.3901        | 9.8462  | 32   | 1.4060          |
+| 1.3781        | 10.7692 | 35   | 1.3861          |
+| 0.9873        | 12.0    | 39   | 1.3718          |
+| 1.3301        | 12.9231 | 42   | 1.3656          |
+| 1.1055        | 13.8462 | 45   | 1.3630          |
 ### Framework versions
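
Reviewer note: as a reading aid for the README change, the sketch below compares the validation-loss column before and after this commit. The numbers are copied from the removed and added table rows; the variable and function names are illustrative and not part of the repository. It only summarizes the two logged runs and does not establish that the learning-rate change (0.00015 to 0.0002) is the sole cause of the difference.

```python
# Validation losses copied from the removed (-) and added (+) table rows.
steps = [3, 6, 9, 13, 16, 19, 22, 26, 29, 32, 35, 39, 42, 45]
old_val = [4.0311, 3.6064, 3.2058, 2.7664, 2.4957, 2.2613, 2.0239,
           1.8148, 1.7127, 1.6285, 1.5605, 1.4999, 1.4764, 1.4675]
new_val = [3.9701, 3.4314, 2.9386, 2.4495, 2.1497, 1.8726, 1.7120,
           1.5321, 1.4462, 1.4060, 1.3861, 1.3718, 1.3656, 1.3630]


def relative_improvement(old: float, new: float) -> float:
    """Fraction by which `new` is lower than `old`."""
    return (old - new) / old


# Relative drop in the final reported evaluation loss (step 45).
final_gain = relative_improvement(old_val[-1], new_val[-1])

# Whether the new run is lower at every logged evaluation step.
new_always_lower = all(n < o for o, n in zip(old_val, new_val))

# Step at which the absolute gap between the two runs is largest.
gap_by_step = {s: o - n for s, o, n in zip(steps, old_val, new_val)}
widest_gap_step = max(gap_by_step, key=gap_by_step.get)

print(f"final validation loss improved by {final_gain:.1%}")
print(f"new run lower at every eval step: {new_always_lower}")
print(f"largest gap at step {widest_gap_step}")
```

On these numbers the final evaluation loss drops by roughly 7% relative to the previous run.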

runs/Oct24_19-20-12_1f4e1c060daf/events.out.tfevents.1729797618.1f4e1c060daf.3103.6 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:654c73eef1e15f6158be30b65d9d84f88930b18bbf2f42d4568a05e72f1f74b2
+size 4184

runs/Oct24_19-20-27_1f4e1c060daf/events.out.tfevents.1729797628.1f4e1c060daf.3103.7 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a8a026ada6fdf7963a9492b6b3bf95a7c89b670666485d5446a5b2e0a0651d1c
+size 12552

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:023476b94d79af97ff9f39f01d7facdda61b8aaaaff1a07986c51bf0f5a941a0
+oid sha256:c99249e625a9cbea47a8357a4c23c8d42a0c77e17f4dcbd3df88a0e58da6dfe8
 size 5176
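
Reviewer note: the binary files in this commit appear in the diff only as Git LFS pointer files, each with three `key value` lines (`version`, `oid`, `size`). The sketch below shows one way to read such a pointer. The function name `parse_lfs_pointer` is hypothetical, and the parser handles only the three keys seen in this diff rather than the full pointer specification.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the three-line Git LFS pointer format shown in this diff."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer for the updated training_args.bin, copied from the diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c99249e625a9cbea47a8357a4c23c8d42a0c77e17f4dcbd3df88a0e58da6dfe8
size 5176
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

Because `size` is unchanged at 5176 bytes while the `oid` differs, the diff shows that the content of `training_args.bin` changed without changing its length, which is consistent with a changed hyperparameter value.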