datajose/pruebas-ft

Files changed (4)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5016
+- Loss: 0.7598
 ## Model description
@@ -36,31 +36,25 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
-- eval_batch_size: 4
+- train_batch_size: 20
+- eval_batch_size: 20
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 16
+- total_train_batch_size: 80
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs: 10
+- num_epochs: 4
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.2715        | 0.96  | 20   | 1.7064          |
-| 0.71          | 1.98  | 41   | 1.1687          |
-| 0.5515        | 2.99  | 62   | 1.0146          |
-| 0.5052        | 4.0   | 83   | 0.8605          |
-| 0.4887        | 4.96  | 103  | 0.7023          |
-| 0.4311        | 5.98  | 124  | 0.6066          |
-| 0.418         | 6.99  | 145  | 0.5606          |
-| 0.4088        | 8.0   | 166  | 0.5206          |
-| 0.4243        | 8.96  | 186  | 0.5048          |
-| 0.3898        | 9.64  | 200  | 0.5016          |
+| 1.334         | 1.0   | 192  | 0.8773          |
+| 0.8235        | 2.0   | 385  | 0.8020          |
+| 0.7665        | 3.0   | 578  | 0.7718          |
+| 0.7357        | 3.98  | 768  | 0.7598          |
 ### Framework versions
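
Note on the batch-size fields in the README.md hunk: both the old and the new `total_train_batch_size` values match `train_batch_size` multiplied by `gradient_accumulation_steps`, assuming a single training device (the diff does not state the device count). A minimal sketch of that arithmetic follows; `effective_batch_size` is a hypothetical helper name used only for illustration, not a library function.

```python
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples consumed per optimizer step.

    num_devices=1 is an assumption; the diff does not say how many
    devices were used.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed (-) and added (+) lines of the hunk.
old_total = effective_batch_size(train_batch_size=4, gradient_accumulation_steps=4)
new_total = effective_batch_size(train_batch_size=20, gradient_accumulation_steps=4)

print(old_total, new_total)
```

Under that assumption the results agree with the reported 16 and 80.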

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4b8f41a103ccfef48a0c7101fd706957a4b02d57796fae22b701eec88a4af294
+oid sha256:5e379e2f988aad18294bce115ce2a5a4b71dc9615cfc746c1ad847741e9af26b
 size 8397056

runs/Mar12_09-59-19_datajose/events.out.tfevents.1710248364.datajose.6585.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:07f962b31d666c90b8c9dfc435cfee8927ed0d62f5973279e965730bec976558
+size 5222

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:30e010fd8cbc265a352f57d16ac7c99310ada8ea19b2d05a7f4aa4a081d8dd63
+oid sha256:1eba88998a55b1eeeac22e77eb413c9ac68fa2fbf92e95166cdfa22737d5757d
 size 4856
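
Note on the LFS-tracked files: each one appears in the diff as a small pointer made of `key value` lines (`version`, `oid`, `size`), so a changed binary shows up only as a changed `oid`. A rough sketch of comparing two such pointers follows, using the `training_args.bin` values from this commit; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of the Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into a {key: value} dict.

    Assumes one 'key value' pair per line, as in the pointers shown
    in this diff.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Old and new pointer contents for training_args.bin, copied from the diff.
old_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:30e010fd8cbc265a352f57d16ac7c99310ada8ea19b2d05a7f4aa4a081d8dd63
size 4856
""")
new_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:1eba88998a55b1eeeac22e77eb413c9ac68fa2fbf92e95166cdfa22737d5757d
size 4856
""")

content_changed = old_ptr["oid"] != new_ptr["oid"]
size_unchanged = old_ptr["size"] == new_ptr["size"]
print(content_changed, size_unchanged)
```

An identical `size` with a different `oid` means the file contents changed while the byte length stayed the same.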