Jlonge4/rag-rel

Files changed (12)

README.md +19 -19
adapter_config.json +6 -6
adapter_model.safetensors +1 -1
runs/Oct13_18-18-49_48c91bda07e0/events.out.tfevents.1728843530.48c91bda07e0.9895.0 +3 -0
runs/Oct13_18-20-53_48c91bda07e0/events.out.tfevents.1728843654.48c91bda07e0.9895.1 +3 -0
runs/Oct13_18-21-27_48c91bda07e0/events.out.tfevents.1728843688.48c91bda07e0.11185.0 +3 -0
runs/Oct13_18-23-33_48c91bda07e0/events.out.tfevents.1728843814.48c91bda07e0.11185.1 +3 -0
runs/Oct13_18-24-23_48c91bda07e0/events.out.tfevents.1728843864.48c91bda07e0.12055.0 +3 -0
runs/Oct13_18-25-58_48c91bda07e0/events.out.tfevents.1728843959.48c91bda07e0.12055.1 +3 -0
runs/Oct13_18-26-54_48c91bda07e0/events.out.tfevents.1728844015.48c91bda07e0.12055.2 +3 -0
runs/Oct13_18-28-24_48c91bda07e0/events.out.tfevents.1728844105.48c91bda07e0.13277.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,12 +16,12 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/grounded-ai-rag-relevance/runs/bnvts1hx)
 # grounded-ai-rag-3
 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4248
 - Rouge1: 1.0
 - Rouge2: 0.0
 - Rougel: 1.0
@@ -44,12 +44,12 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
 - train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 15
@@ -59,20 +59,20 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| 1.0354        | 5.0   | 5    | 1.7161          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.9393        | 10.0  | 10   | 1.5598          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.8098        | 15.0  | 15   | 1.3913          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.6939        | 20.0  | 20   | 1.2470          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.5662        | 25.0  | 25   | 1.0859          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.4027        | 30.0  | 30   | 0.9390          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.2595        | 35.0  | 35   | 0.8088          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.159         | 40.0  | 40   | 0.8187          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.1225        | 45.0  | 45   | 0.9043          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.0808        | 50.0  | 50   | 0.9985          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.0334        | 55.0  | 55   | 1.1227          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.0121        | 60.0  | 60   | 1.2501          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.0052        | 65.0  | 65   | 1.3779          | 1.0    | 0.0    | 1.0    | 1.0       |
-| 0.0043        | 70.0  | 70   | 1.4248          | 1.0    | 0.0    | 1.0    | 1.0       |
 ### Framework versions

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/josh-longenecker1-groundedai/grounded-ai-rag-relevance/runs/oius6vlo)
 # grounded-ai-rag-3
 This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.3813
 - Rouge1: 1.0
 - Rouge2: 0.0
 - Rougel: 1.0
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 7e-05
 - train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 15
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+| 1.7953        | 5.0   | 5    | 1.8076          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 1.4603        | 10.0  | 10   | 1.6034          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 1.2107        | 15.0  | 15   | 1.4092          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.911         | 20.0  | 20   | 1.2036          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.5225        | 25.0  | 25   | 1.0263          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.2248        | 30.0  | 30   | 0.9228          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.1138        | 35.0  | 35   | 0.9692          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0533        | 40.0  | 40   | 1.1089          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0197        | 45.0  | 45   | 1.1951          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0066        | 50.0  | 50   | 1.2534          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0051        | 55.0  | 55   | 1.3186          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0036        | 60.0  | 60   | 1.3523          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0047        | 65.0  | 65   | 1.3669          | 1.0    | 0.0    | 1.0    | 1.0       |
+| 0.0037        | 70.0  | 70   | 1.3813          | 1.0    | 0.0    | 1.0    | 1.0       |
 ### Framework versions
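
Reviewer note: in both versions of the results table, the validation loss falls, bottoms out, and then climbs again while the training loss keeps shrinking, and the "Loss" reported at the top of the card matches the final row (step 70) rather than the lowest row. The sketch below is only a reading aid built from the numbers in the two tables. The helper name `best_step` is hypothetical, and the effective-batch arithmetic assumes a single device, which the diff does not state.

```python
# Validation loss per step, copied from the old (-) and new (+) tables above.
old_val = [(5, 1.7161), (10, 1.5598), (15, 1.3913), (20, 1.2470), (25, 1.0859),
           (30, 0.9390), (35, 0.8088), (40, 0.8187), (45, 0.9043), (50, 0.9985),
           (55, 1.1227), (60, 1.2501), (65, 1.3779), (70, 1.4248)]
new_val = [(5, 1.8076), (10, 1.6034), (15, 1.4092), (20, 1.2036), (25, 1.0263),
           (30, 0.9228), (35, 0.9692), (40, 1.1089), (45, 1.1951), (50, 1.2534),
           (55, 1.3186), (60, 1.3523), (65, 1.3669), (70, 1.3813)]


def best_step(history):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    return min(history, key=lambda pair: pair[1])


def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Per-device batch size times accumulation steps times device count.

    num_devices=1 is an assumption; the model card does not list it.
    """
    return per_device * grad_accum * num_devices


for name, history in (("old", old_val), ("new", new_val)):
    step, loss = best_step(history)
    final_step, final_loss = history[-1]
    print(f"{name}: best val loss {loss} at step {step}, "
          f"final val loss {final_loss} at step {final_step}")

# Matches total_train_batch_size in the card: 8 before, 4 after.
print(effective_batch_size(2, 4), effective_batch_size(2, 2))
```

On these numbers the lowest validation loss is 0.8088 at step 35 in the old run and 0.9228 at step 30 in the new run, so the final-step figures in the card (1.4248 and 1.3813) do not reflect the best point of either run.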

adapter_config.json CHANGED

@@ -11,7 +11,7 @@
   "layers_to_transform": null,
   "loftq_config": {},
   "lora_alpha": 32,
-  "lora_dropout": 0.1,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
@@ -20,13 +20,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "down_proj",
-    "v_proj",
     "gate_proj",
-    "q_proj",
-    "k_proj",
     "o_proj",
-    "up_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "layers_to_transform": null,
   "loftq_config": {},
   "lora_alpha": 32,
+  "lora_dropout": 0.3,
   "megatron_config": null,
   "megatron_core": "megatron.core",
   "modules_to_save": null,
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "gate_proj",
+    "v_proj",
+    "up_proj",
+    "down_proj",
     "o_proj",
+    "q_proj",
+    "k_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
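
Reviewer note: most of the +6 -6 in this file is a reordering of `target_modules`. The same seven module names appear on both sides, and the only value that differs in the visible hunks is `lora_dropout`. A minimal sketch of an order-insensitive comparison follows. It covers only the keys that change in the hunks above, the helper name `semantic_changes` is hypothetical, and treating every list as unordered is an assumption that suits `target_modules` but may not suit other fields.

```python
# Only the keys touched by the hunks above; the rest of the config is unchanged.
old_cfg = {
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "target_modules": ["down_proj", "v_proj", "gate_proj", "q_proj",
                       "k_proj", "o_proj", "up_proj"],
}
new_cfg = {
    "lora_alpha": 32,
    "lora_dropout": 0.3,
    "target_modules": ["gate_proj", "v_proj", "up_proj", "down_proj",
                       "o_proj", "q_proj", "k_proj"],
}


def semantic_changes(old, new):
    """Return {key: (old_value, new_value)} for keys whose values differ.

    Lists are compared as sets, so a pure reordering is not reported.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        a, b = old.get(key), new.get(key)
        if isinstance(a, list) and isinstance(b, list):
            if set(a) != set(b):
                changes[key] = (a, b)
        elif a != b:
            changes[key] = (a, b)
    return changes


print(semantic_changes(old_cfg, new_cfg))
```

For the two fragments above this reports only `lora_dropout` going from 0.1 to 0.3.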

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:eee14e572e234fef48bd05e6751732a7be5fcc36a58cb029f509a042420f7e66
 size 35668592

 version https://git-lfs.github.com/spec/v1
+oid sha256:d3adb363beb9ab4b7603afdf9abfceea13dc71393dd559126e4c5f1df8b239fc
 size 35668592

runs/Oct13_18-18-49_48c91bda07e0/events.out.tfevents.1728843530.48c91bda07e0.9895.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:933177eae3f100d17cda6870d2222c8910882ec6705c96f62102436ccf46c6e4
+size 22254

runs/Oct13_18-20-53_48c91bda07e0/events.out.tfevents.1728843654.48c91bda07e0.9895.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e741a15fb17b24228f952ec5a01d31346506609fbd23f667f40555841d40097f
+size 4184

runs/Oct13_18-21-27_48c91bda07e0/events.out.tfevents.1728843688.48c91bda07e0.11185.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e9c9fe55b25f7f812326a5f0f57a62ea987089f6b71af5534caf4645500b3384
+size 16462

runs/Oct13_18-23-33_48c91bda07e0/events.out.tfevents.1728843814.48c91bda07e0.11185.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e5d59df0a98e21af953b120fcca1c64f00542da9d6e04adedf5ea7f8d9fa2f3
+size 4184

runs/Oct13_18-24-23_48c91bda07e0/events.out.tfevents.1728843864.48c91bda07e0.12055.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1592c21d085aebf049083aece02af2032530633eec2d56766c9619959270b8c2
+size 17341

runs/Oct13_18-25-58_48c91bda07e0/events.out.tfevents.1728843959.48c91bda07e0.12055.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2dafe95867f37ec3f6a9da93b33af24decd93bd74d94d32701e3869060fbc2c2
+size 22047

runs/Oct13_18-26-54_48c91bda07e0/events.out.tfevents.1728844015.48c91bda07e0.12055.2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:86b00a6264d8feea50543db80acbce2b72d2fb9cf3b6969f2b314e15e2daee01
+size 20340

runs/Oct13_18-28-24_48c91bda07e0/events.out.tfevents.1728844105.48c91bda07e0.13277.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:db988e43c51d8b4eecf4b01b483d6de0d8341743890ad95aa535fd127d4464d4
+size 29481

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0aaa190a3cb68a15b368829d3dff9cf308f2721e76ebc2b27b8b0ec9bc845f20
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:a889510aa783638fd8cea9b7ae4b0fa932377333e913d51cf0ca21ff3e6acc08
 size 5432