irenewds/shawgpt-ft

Files changed (3)

README.md +16 -21
runs/Oct24_19-48-07_1f4e1c060daf/events.out.tfevents.1729799300.1f4e1c060daf.20197.1 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.3316
+- Loss: 1.4750
 ## Model description
@@ -44,32 +44,27 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 20
+- num_epochs: 15
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6062        | 0.9231  | 3    | 4.0336          |
-| 4.1724        | 1.8462  | 6    | 3.6011          |
-| 3.669         | 2.7692  | 9    | 3.1876          |
-| 2.4069        | 4.0     | 13   | 2.7142          |
-| 2.8308        | 4.9231  | 16   | 2.4288          |
-| 2.4589        | 5.8462  | 19   | 2.1674          |
-| 2.1294        | 6.7692  | 22   | 1.9285          |
-| 1.4108        | 8.0     | 26   | 1.7008          |
-| 1.6925        | 8.9231  | 29   | 1.5824          |
-| 1.5326        | 9.8462  | 32   | 1.4923          |
-| 1.4721        | 10.7692 | 35   | 1.4347          |
-| 1.0272        | 12.0    | 39   | 1.3852          |
-| 1.3601        | 12.9231 | 42   | 1.3676          |
-| 1.3123        | 13.8462 | 45   | 1.3553          |
-| 1.3121        | 14.7692 | 48   | 1.3469          |
-| 0.9808        | 16.0    | 52   | 1.3382          |
-| 1.2776        | 16.9231 | 55   | 1.3343          |
-| 1.2614        | 17.8462 | 58   | 1.3322          |
-| 0.888         | 18.4615 | 60   | 1.3316          |
+| 4.6053        | 0.9231  | 3    | 4.0290          |
+| 4.1741        | 1.8462  | 6    | 3.6093          |
+| 3.6913        | 2.7692  | 9    | 3.2120          |
+| 2.4435        | 4.0     | 13   | 2.7681          |
+| 2.9111        | 4.9231  | 16   | 2.5080          |
+| 2.5883        | 5.8462  | 19   | 2.2860          |
+| 2.3061        | 6.7692  | 22   | 2.0874          |
+| 1.5537        | 8.0     | 26   | 1.8540          |
+| 1.8704        | 8.9231  | 29   | 1.7352          |
+| 1.7152        | 9.8462  | 32   | 1.6456          |
+| 1.647         | 10.7692 | 35   | 1.5787          |
+| 1.1487        | 12.0    | 39   | 1.5120          |
+| 1.5072        | 12.9231 | 42   | 1.4856          |
+| 1.2433        | 13.8462 | 45   | 1.4750          |
 ### Framework versions
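
Review note: the README hunk above changes `num_epochs` from 20 to 15 while keeping `lr_scheduler_type: linear` and `lr_scheduler_warmup_steps: 2`. A linear schedule with warmup is usually defined relative to the total number of training steps, so shortening the run may also change the learning rate at any given step. The sketch below illustrates that relationship in plain Python. It is not the trainer's own code: the function name `linear_warmup_multiplier` is hypothetical, and the total step counts (60 and 45) are assumptions taken from the last logged row of each results table.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Learning-rate multiplier for a linear schedule with warmup.

    Ramps linearly from 0 to 1 over `warmup_steps`, then decays
    linearly back to 0 at `total_steps`. Hypothetical helper written
    for illustration; it follows the commonly documented form of this
    schedule rather than any specific library implementation.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


WARMUP_STEPS = 2       # lr_scheduler_warmup_steps from the README
TOTAL_STEPS_OLD = 60   # assumption: last logged step of the 20-epoch run
TOTAL_STEPS_NEW = 45   # assumption: last logged step of the 15-epoch run

# Compare the multiplier at a few steps that appear in both tables.
for step in (3, 13, 26, 39, 45):
    old = linear_warmup_multiplier(step, WARMUP_STEPS, TOTAL_STEPS_OLD)
    new = linear_warmup_multiplier(step, WARMUP_STEPS, TOTAL_STEPS_NEW)
    print(f"step {step:>2}: 20-epoch x{old:.3f}  15-epoch x{new:.3f}")
```

Under these assumptions the 15-epoch run reaches a zero multiplier at step 45, while the 20-epoch run still has a nonzero one there. That may be one reason the validation losses at matching steps differ between the two tables, although the diff alone does not confirm it.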

runs/Oct24_19-48-07_1f4e1c060daf/events.out.tfevents.1729799300.1f4e1c060daf.20197.1 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a41b404acc35ce63e56698c03510ad7f516a90e9fff9bda1b6cebc3b2ec53ed0
+size 12553

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6a9e4f7d7f8a5b75519fafab0eb9b2fdd1d4572066c79fec4b297ee1e7af666d
+oid sha256:3c76033ac543464722332330b433f8981f5b66935e895d6b2bbb6f492057adcb
 size 5176
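
Review note: the two binary files in this commit appear in the diff only as Git LFS pointer files, each with three `key value` lines (`version`, `oid`, `size`). For `training_args.bin` the `oid` changes while `size` stays at 5176, so the file content changed without changing its length. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical, and the parsing assumes only the three fields shown in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields.

    Hypothetical helper for illustration; it handles only the
    `version`, `oid`, and `size` keys that appear in this diff.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer text for the new training_args.bin, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3c76033ac543464722332330b433f8981f5b66935e895d6b2bbb6f492057adcb
size 5176
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Comparing the parsed `digest` of the old and new pointers is enough to tell that the tracked file changed, without downloading the binary itself.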