dwb2023/llama38binstruct_summarize

Files changed (5)

README.md CHANGED

@@ -20,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.8739
+- Loss: 2.0847
 ## Model description
@@ -50,12 +50,12 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 1.4551        | 1.25  | 25   | 1.6436          |
-| 0.4259        | 2.5   | 50   | 1.7066          |
-| 0.243         | 3.75  | 75   | 1.8454          |
-| 0.1138        | 5.0   | 100  | 1.8739          |
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.4079        | 1.1905 | 25   | 1.4325          |
+| 0.3935        | 2.3810 | 50   | 1.6786          |
+| 0.3836        | 3.5714 | 75   | 1.7694          |
+| 0.1039        | 4.7619 | 100  | 2.0847          |
 ### Framework versions
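
The two training tables can be compared numerically. The sketch below copies the rows from the README diff and derives two things: the implied number of optimizer steps per epoch, and the step with the lowest validation loss in each run. It assumes the logged epoch is simply the step count divided by the steps per epoch. That is an assumption about how the trainer reports the value, not something the diff states. The helper names `steps_per_epoch` and `best_checkpoint` are illustrative only.

```python
# Rows copied from the old and new "Training results" tables in the README diff.
# Tuple layout: (training_loss, epoch, step, validation_loss)
old_rows = [
    (1.4551, 1.25, 25, 1.6436),
    (0.4259, 2.5, 50, 1.7066),
    (0.243, 3.75, 75, 1.8454),
    (0.1138, 5.0, 100, 1.8739),
]
new_rows = [
    (1.4079, 1.1905, 25, 1.4325),
    (0.3935, 2.3810, 50, 1.6786),
    (0.3836, 3.5714, 75, 1.7694),
    (0.1039, 4.7619, 100, 2.0847),
]


def steps_per_epoch(rows):
    """Implied steps per epoch for each row, assuming epoch = step / steps_per_epoch."""
    return [round(step / epoch) for _, epoch, step, _ in rows]


def best_checkpoint(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    _, _, step, val_loss = min(rows, key=lambda row: row[3])
    return step, val_loss


if __name__ == "__main__":
    for label, rows in (("old", old_rows), ("new", new_rows)):
        print(label, "steps/epoch:", steps_per_epoch(rows))
        print(label, "lowest validation loss at (step, loss):", best_checkpoint(rows))
        print(label, "final validation loss:", rows[-1][3])
```

Under that assumption the old run implies 20 steps per epoch and the new run 21, which would explain why step 100 now falls at epoch 4.7619 rather than 5.0. In both runs the `Loss` value reported at the top of the README matches the validation loss of the last row (step 100), while the lowest validation loss in the table is logged at step 25.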

adapter_config.json CHANGED

@@ -20,13 +20,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "o_proj",
     "up_proj",
-    "gate_proj",
-    "v_proj",
     "k_proj",
-    "q_proj",
-    "down_proj"
+    "o_proj",
+    "down_proj",
+    "v_proj",
+    "gate_proj",
+    "q_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

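The `target_modules` hunk in adapter_config.json looks large, but it may be a reordering only. A minimal check, using the module names copied from both sides of the diff, is to compare the two lists as sets. This assumes the order of `target_modules` carries no meaning for how the adapter is applied. The helper name `same_modules` is illustrative.

```python
# Module names copied from the removed and added sides of the adapter_config.json diff.
old_target_modules = [
    "o_proj", "up_proj", "gate_proj", "v_proj", "k_proj", "q_proj", "down_proj",
]
new_target_modules = [
    "up_proj", "k_proj", "o_proj", "down_proj", "v_proj", "gate_proj", "q_proj",
]


def same_modules(old, new):
    """True when both lists name the same modules, ignoring order."""
    return set(old) == set(new)


order_only_change = same_modules(old_target_modules, new_target_modules)

if __name__ == "__main__":
    print("same modules, different order:", order_only_change)
    print("added:", sorted(set(new_target_modules) - set(old_target_modules)))
    print("removed:", sorted(set(old_target_modules) - set(new_target_modules)))
```

Both sides list the same seven projection modules, so under the stated assumption this hunk does not change which layers the adapter targets.
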
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f1f24d7bb57b1d7f3cfb0f233972104f536597227c953d73af4de04f854b8130
+oid sha256:d5eb2ba5bd31fe265e089ab86d5ca7f0167c063cc985c997332e6d56c04fb999
 size 167832240
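
The binary files appear in the diff only as Git LFS pointer files, each with a `version`, an `oid`, and a `size` line. The sketch below parses the old and new pointers for adapter_model.safetensors, copied from the diff, and compares them. The function name `parse_lfs_pointer` is illustrative and is not part of any Git LFS tooling.

```python
# Old and new pointer contents for adapter_model.safetensors, copied from the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f1f24d7bb57b1d7f3cfb0f233972104f536597227c953d73af4de04f854b8130
size 167832240
"""
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d5eb2ba5bd31fe265e089ab86d5ca7f0167c063cc985c997332e6d56c04fb999
size 167832240
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print("content changed:", old["digest"] != new["digest"])
    print("size unchanged:", old["size"] == new["size"])
```

The digest differs while the size stays at 167832240 bytes. That is consistent with the adapter weights having been retrained without any change to the tensor shapes, although the pointer alone cannot confirm this.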

runs/Jun13_08-42-32_19d0f881613b/events.out.tfevents.1718268154.19d0f881613b.3120.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:43f0eb884c762790559d23d7e94de2b68de5fd2f24f6934855771c4b6a889ee4
+size 9238

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:683bfcec7c2df723778f1e13f47da6119b6028538f0425e04759cf2e99f62564
+oid sha256:88ec13acf05418ea06f7ef2cb995791d2f2ad262c2272593cce204cea317952b
 size 5368