End of training

Files changed (8)

README.md CHANGED

@@ -19,6 +19,8 @@ should probably proofread and complete it, then remove this comment. -->
 # zephyr-7b-sft-lora
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.1563
 ## Model description
@@ -45,13 +47,14 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- num_epochs: 1
+- num_epochs: 3
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
 | No log        | 1.0   | 1    | 1.1585          |
+| No log        | 2.0   | 3    | 1.1563          |
 ### Framework versions

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a9153adf247e26307caae2a48bc8fe9d5c40f3e0315878c4c2b8f7fa8d93041
+oid sha256:bd498aa390277d4d5480c6044b88e597b352fe3a0278fb1c963bd55ebf39b619
 size 109086672

all_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
     "total_flos": 5899069012574208.0,
-    "train_loss": 0.5863452553749084,
-    "train_runtime": 93.4265,
+    "train_loss": 0.5863915681838989,
+    "train_runtime": 88.6368,
     "train_samples": 100,
-    "train_samples_per_second": 0.717,
+    "train_samples_per_second": 0.756,
     "train_steps_per_second": 0.011
 }

runs/Apr08_14-24-03_39f6269d6750/events.out.tfevents.1712586255.39f6269d6750.561.2 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:5e850f6036dc045ca31bdd4815d0ee7fef227cdaf0aab8ef642f511eab455aa7
+size 5939

special_tokens_map.json CHANGED

@@ -13,7 +13,13 @@
     "rstrip": false,
     "single_word": false
   },
-  "pad_token": "</s>",
+  "pad_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "unk_token": {
     "content": "<unk>",
     "lstrip": false,
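
Note on the `pad_token` hunk above: the entry changes from a bare string to an object with a `content` field, but the token text itself (`</s>`) is the same on both sides. The following is a minimal sketch of reading either form; `token_content` is a hypothetical helper written for this note, not part of any library, and the two JSON snippets are reduced copies of the old and new entries.

```python
import json


def token_content(entry):
    """Return the token text from a special-token entry.

    Hypothetical helper: accepts either the bare-string form
    ("</s>") or the object form ({"content": "</s>", ...}).
    """
    if isinstance(entry, str):
        return entry
    return entry["content"]


# Reduced copies of the old and new pad_token entries from the hunk.
old_map = json.loads('{"pad_token": "</s>"}')
new_map = json.loads(
    '{"pad_token": {"content": "</s>", "lstrip": false, '
    '"normalized": false, "rstrip": false, "single_word": false}}'
)

# True when both serializations name the same token text.
same_pad = token_content(old_map["pad_token"]) == token_content(new_map["pad_token"])
print(same_pad)
```

Only the serialization differs here; the object form additionally records the `lstrip`, `normalized`, `rstrip`, and `single_word` flags shown in the hunk.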

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
     "total_flos": 5899069012574208.0,
-    "train_loss": 0.5863452553749084,
-    "train_runtime": 93.4265,
+    "train_loss": 0.5863915681838989,
+    "train_runtime": 88.6368,
     "train_samples": 100,
-    "train_samples_per_second": 0.717,
+    "train_samples_per_second": 0.756,
     "train_steps_per_second": 0.011
 }

trainer_state.json CHANGED

@@ -10,19 +10,19 @@
   "log_history": [
     {
       "epoch": 1.0,
-      "eval_loss": 1.1567405462265015,
-      "eval_runtime": 24.4278,
-      "eval_samples_per_second": 2.62,
-      "eval_steps_per_second": 2.62,
+      "eval_loss": 1.1584750413894653,
+      "eval_runtime": 24.1178,
+      "eval_samples_per_second": 2.654,
+      "eval_steps_per_second": 2.654,
       "step": 1
     },
     {
       "epoch": 1.0,
       "step": 1,
       "total_flos": 5899069012574208.0,
-      "train_loss": 0.5863452553749084,
-      "train_runtime": 93.4265,
-      "train_samples_per_second": 0.717,
+      "train_loss": 0.5863915681838989,
+      "train_runtime": 88.6368,
+      "train_samples_per_second": 0.756,
       "train_steps_per_second": 0.011
     }
   ],

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4986090b642bc14ee74878c04e50e4221377fe1e24527819cbd66ee69af07d71
+oid sha256:40a1bb5dcc207e7129f2a045941c3c828bd157617df0503126a3864e20ad1063
 size 4984
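
Note on the Git LFS pointers in this diff: `adapter_model.safetensors`, the `events.out.tfevents` file, and `training_args.bin` are shown as three-line pointer files (`version`, `oid sha256:…`, `size`) rather than as their binary contents. Below is a minimal sketch of parsing such a pointer and checking a byte string against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names introduced for this note, and the sketch assumes the whole file fits in memory.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError("this sketch only handles sha256 oids")
    if len(data) != pointer["size"]:
        return False
    return hashlib.sha256(data).hexdigest() == pointer["digest"]


# The new training_args.bin pointer from the hunk above.
pointer = parse_lfs_pointer(
    """
    version https://git-lfs.github.com/spec/v1
    oid sha256:40a1bb5dcc207e7129f2a045941c3c828bd157617df0503126a3864e20ad1063
    size 4984
    """
)
print(pointer["size"], pointer["digest"][:8])
```

To check a downloaded file, read it in binary mode and pass the bytes to `matches_pointer`. For a file as large as the 109086672-byte adapter, hashing in chunks would be the more practical choice; that is left out here to keep the sketch short.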