FatCat87/76e772b1-fe02-45e3-aa5b-7b36ec7abf8d

@@ -1,12 +1,12 @@
 ---
-license: gemma
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter2
 model-index:
-- name: 1ef754c7-c737-46d3-b5e2-e40eed5cff18
   results: []
 ---
@@ -19,19 +19,19 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter2
 bf16: auto
 datasets:
 - data_files:
-  - 54e090b2393cef7b_train_data.json
   ds_type: json
   format: custom
-  path: 54e090b2393cef7b_train_data.json
   type:
     field: null
-    field_input: phonemes
-    field_instruction: text
-    field_output: text_description
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/1ef754c7-c737-46d3-b5e2-e40eed5cff18
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -73,7 +73,8 @@ sample_packing: true
 saves_per_epoch: 1
 seed: 701
 sequence_len: 4096
-special_tokens: null
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -82,9 +83,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 1ef754c7-c737-46d3-b5e2-e40eed5cff18
 wandb_project: subnet56
-wandb_runid: 1ef754c7-c737-46d3-b5e2-e40eed5cff18
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,12 +95,12 @@ xformers_attention: null
 </details><br>
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/6423an98)
-# 1ef754c7-c737-46d3-b5e2-e40eed5cff18
-This model is a fine-tuned version of [UCLA-AGI/Gemma-2-9B-It-SPPO-Iter2](https://huggingface.co/UCLA-AGI/Gemma-2-9B-It-SPPO-Iter2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8791
 ## Model description
@@ -129,17 +130,17 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 2
 - num_epochs: 1
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 4.954         | 0.04  | 1    | 4.9802          |
-| 1.398         | 0.28  | 7    | 1.2472          |
-| 0.944         | 0.56  | 14   | 0.9353          |
-| 0.8764        | 0.84  | 21   | 0.8791          |
 ### Framework versions

 ---
+license: apache-2.0
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
+base_model: princeton-nlp/Sheared-LLaMA-1.3B
 model-index:
+- name: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
   results: []
 ---
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
+base_model: princeton-nlp/Sheared-LLaMA-1.3B
 bf16: auto
 datasets:
 - data_files:
+  - cb26f8bb8a47c11f_train_data.json
   ds_type: json
   format: custom
+  path: cb26f8bb8a47c11f_train_data.json
   type:
     field: null
+    field_input: null
+    field_instruction: instruction
+    field_output: output
     field_system: null
     format: null
     no_input_format: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
+hub_model_id: FatCat87/76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
 saves_per_epoch: 1
 seed: 701
 sequence_len: 4096
+special_tokens:
+  pad_token: </s>
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
+wandb_name: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
 wandb_project: subnet56
+wandb_runid: 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
 </details><br>
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/edodtlvp)
+# 76e772b1-fe02-45e3-aa5b-7b36ec7abf8d
+This model is a fine-tuned version of [princeton-nlp/Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.5490
 ## Model description
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 9
 - num_epochs: 1
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.8322        | 0.0050 | 1    | 1.9699          |
+| 1.5589        | 0.2537 | 51   | 1.6483          |
+| 1.4753        | 0.5075 | 102  | 1.5744          |
+| 1.4788        | 0.7612 | 153  | 1.5490          |
 ### Framework versions

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:750d7b7610c14e03cf13b82281a0bc5c1ab8820c530c96e747d518ccf2e52cbd
-size 432357050

 version https://git-lfs.github.com/spec/v1
+oid sha256:d0ba8c4ee480fe38718ba5baef8c771892e5806bacea580ba0421ba81448e4eb
+size 120052362
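
The two `adapter_model.bin` entries above are Git LFS pointer files (a `version` line, an `oid` line, and a `size` line in bytes), not the weights themselves. As a minimal sketch, the pointers can be parsed and their sizes compared as follows. The helper name `parse_lfs_pointer` and the variables `OLD_POINTER` and `NEW_POINTER` are hypothetical and introduced only for illustration; the pointer contents are copied from the diff above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<hash-method>:<hex digest>".
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer contents copied from the removed (-) and added (+) sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:750d7b7610c14e03cf13b82281a0bc5c1ab8820c530c96e747d518ccf2e52cbd
size 432357050
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d0ba8c4ee480fe38718ba5baef8c771892e5806bacea580ba0421ba81448e4eb
size 120052362
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
ratio = new["size"] / old["size"]

print(f"old adapter: {old['size'] / 1e6:.1f} MB")
print(f"new adapter: {new['size'] / 1e6:.1f} MB")
print(f"new/old size ratio: {ratio:.2f}")
```

The sketch only reads the pointer text; it does not download or verify the underlying file. Checking the actual weights against the `oid` would require hashing the downloaded `adapter_model.bin` with SHA-256 and comparing digests.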