irenewds/shawgpt-ft

Files changed (3)

README.md +21 -21
runs/Oct24_01-36-53_89957c487371/events.out.tfevents.1729733818.89957c487371.1061.1 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4926
 ## Model description
@@ -35,7 +35,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
@@ -51,25 +51,25 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6196        | 0.9231  | 3    | 4.0996          |
-| 4.3186        | 1.8462  | 6    | 3.8038          |
-| 3.9621        | 2.7692  | 9    | 3.5173          |
-| 2.7151        | 4.0     | 13   | 3.1519          |
-| 3.3264        | 4.9231  | 16   | 2.9077          |
-| 3.0314        | 5.8462  | 19   | 2.6983          |
-| 2.7911        | 6.7692  | 22   | 2.5124          |
-| 1.9218        | 8.0     | 26   | 2.2823          |
-| 2.3602        | 8.9231  | 29   | 2.1229          |
-| 2.1297        | 9.8462  | 32   | 1.9649          |
-| 1.9877        | 10.7692 | 35   | 1.8543          |
-| 1.3751        | 12.0    | 39   | 1.7460          |
-| 1.7619        | 12.9231 | 42   | 1.6779          |
-| 1.6587        | 13.8462 | 45   | 1.6203          |
-| 1.6074        | 14.7692 | 48   | 1.5695          |
-| 1.1674        | 16.0    | 52   | 1.5241          |
-| 1.503         | 16.9231 | 55   | 1.5053          |
-| 1.4711        | 17.8462 | 58   | 1.4951          |
-| 1.0311        | 18.4615 | 60   | 1.4926          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.3489
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.00015
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 4.6062        | 0.9231  | 3    | 4.0362          |
+| 4.1781        | 1.8462  | 6    | 3.6161          |
+| 3.6808        | 2.7692  | 9    | 3.2051          |
+| 2.4181        | 4.0     | 13   | 2.7329          |
+| 2.845         | 4.9231  | 16   | 2.4553          |
+| 2.4808        | 5.8462  | 19   | 2.1947          |
+| 2.1606        | 6.7692  | 22   | 1.9599          |
+| 1.425         | 8.0     | 26   | 1.7251          |
+| 1.7116        | 8.9231  | 29   | 1.6083          |
+| 1.5518        | 9.8462  | 32   | 1.5094          |
+| 1.4834        | 10.7692 | 35   | 1.4522          |
+| 1.0371        | 12.0    | 39   | 1.3994          |
+| 1.3682        | 12.9231 | 42   | 1.3823          |
+| 1.3207        | 13.8462 | 45   | 1.3714          |
+| 1.3218        | 14.7692 | 48   | 1.3640          |
+| 0.9873        | 16.0    | 52   | 1.3561          |
+| 1.2872        | 16.9231 | 55   | 1.3521          |
+| 1.2689        | 17.8462 | 58   | 1.3495          |
+| 0.8971        | 18.4615 | 60   | 1.3489          |
 ### Framework versions
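
The two README tables can be compared directly. The following is a minimal sketch, not part of the commit: the variable and function names are hypothetical, and the loss values are copied from the old and new sides of the diff. The 3.25 steps-per-epoch figure is an inference from the logged rows (step 13 is reported at epoch 4.0), not a documented training setting.

```python
# Final evaluation losses copied from the README diff (old run vs. new run).
old_final_loss = 1.4926  # logged with learning_rate: 0.0001
new_final_loss = 1.3489  # logged with learning_rate: 0.00015

absolute_drop = old_final_loss - new_final_loss
relative_drop = absolute_drop / old_final_loss

# Assumption inferred from the table, not from a documented setting:
# step 13 is logged at epoch 4.0, so 13 / 4 = 3.25 optimizer steps per epoch.
STEPS_PER_EPOCH = 13 / 4


def epoch_at(step: int) -> float:
    """Return the fractional epoch for a given optimizer step, rounded like the table."""
    return round(step / STEPS_PER_EPOCH, 4)


print(f"absolute drop in validation loss: {absolute_drop:.4f}")
print(f"relative drop in validation loss: {relative_drop:.1%}")
print(f"epoch at step 60: {epoch_at(60)}")
```

Both runs log the same steps and epochs, so the rows line up one to one; only the learning rate and the resulting losses differ in this diff.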

runs/Oct24_01-36-53_89957c487371/events.out.tfevents.1729733818.89957c487371.1061.1 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca492e035774fdc9433dd1041a741e8ac513f785942f21a4c1edf49992a6c021
+size 14918

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5d24572c9554940e1f92edd78e10243cb6873325ac8bd219df60a27e7683379a
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:00f503e49e741f1ddad3d08ba2dd98cfccccd8d44fbe62354a8313f41c36708e
 size 5176
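
The `training_args.bin` and `events.out.tfevents` entries above are Git LFS pointer files, not the binaries themselves: each holds a spec version, a SHA-256 object ID, and a byte size. The following is a minimal sketch of reading such a pointer and checking a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the three-field layout shown in this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check that a local file has the size and digest the pointer records."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["hash_algo"], data).hexdigest() == pointer["digest"]


# New-side pointer for training_args.bin, copied from the diff.
new_pointer = parse_lfs_pointer(
    """
    version https://git-lfs.github.com/spec/v1
    oid sha256:00f503e49e741f1ddad3d08ba2dd98cfccccd8d44fbe62354a8313f41c36708e
    size 5176
    """
)
print(new_pointer["hash_algo"], new_pointer["size"])
```

In this commit the size of `training_args.bin` stays at 5176 bytes while the object ID changes, so the file contents differ even though its length does not.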