sylyas/123e4567-e89b-12d3-a456-426614174004

Generated from Trainer

8-bit precision

sylyas committed on Nov 20, 2024

Commit fccf47b · verified · 1 Parent(s): f25be64

End of training

Files changed (2)

README.md +8 -8
adapter_model.bin +1 -1

README.md CHANGED

@@ -30,8 +30,8 @@ datasets:
     field_input: input
     field_instruction: instruction
     field_output: output
-    format: '{input}'
-    no_input_format: '{field_instruction}'
+    format: '{instruction} {input}'
+    no_input_format: '{instruction}'
     system_format: '{system}'
     system_prompt: ''
 debug: null
@@ -102,7 +102,7 @@ xformers_attention: null
 This model is a fine-tuned version of [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9130
+- Loss: 0.8958
 ## Model description
@@ -130,16 +130,16 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- training_steps: 481
+- training_steps: 475
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.7333        | 0.0021 | 1    | 1.9654          |
-| 0.8852        | 0.2516 | 121  | 0.9382          |
-| 0.8645        | 0.5031 | 242  | 0.9225          |
-| 1.0676        | 0.7547 | 363  | 0.9130          |
+| 1.3737        | 0.0021 | 1    | 1.6453          |
+| 0.9206        | 0.2505 | 119  | 0.9274          |
+| 1.0707        | 0.5011 | 238  | 0.9007          |
+| 1.0721        | 0.7516 | 357  | 0.8958          |
 ### Framework versions
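
The `format` and `no_input_format` change in the README.md diff is easier to judge with a concrete row. The following is a minimal sketch, assuming the templates are filled by plain string substitution of `{instruction}` and `{input}`. The `render_prompt` helper and the sample rows are hypothetical and are not taken from the training framework.

```python
# Hypothetical helper: fills a prompt template with plain str.format
# substitution. This is an assumption about how the templates behave,
# not the training framework's actual implementation.
def render_prompt(row, fmt, no_input_fmt):
    instruction = row.get("instruction", "")
    user_input = row.get("input", "")
    # Rows with an empty or missing input fall back to the no-input template.
    template = fmt if user_input else no_input_fmt
    return template.format(instruction=instruction, input=user_input)


# Template values copied from the added lines of the diff.
NEW_FORMAT = "{instruction} {input}"
NEW_NO_INPUT_FORMAT = "{instruction}"

# Template value copied from the removed `format` line.
OLD_FORMAT = "{input}"

# Made-up sample rows, for illustration only.
row_with_input = {"instruction": "Translate to French:", "input": "Good morning"}
row_without_input = {"instruction": "Name a prime number."}

with_input = render_prompt(row_with_input, NEW_FORMAT, NEW_NO_INPUT_FORMAT)
without_input = render_prompt(row_without_input, NEW_FORMAT, NEW_NO_INPUT_FORMAT)
old_with_input = render_prompt(row_with_input, OLD_FORMAT, NEW_NO_INPUT_FORMAT)

print(with_input)
print(without_input)
print(old_with_input)
```

Under this assumption, the old `format` value would pass only the input text to the model and drop the instruction, while the new value keeps both.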

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cf1138e4d78d9a6d81cdfa06f98986b150f914bab624559c952d7489be29b535
+oid sha256:e0aa059fd1c09734db9ab6011a808407433369ec1f56ce9ddde4dcc475dc9ad1
 size 167934026
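
The adapter_model.bin diff touches only a Git LFS pointer, which records the SHA-256 object ID and the byte size of the real file. Here the `oid` changes while `size` stays at 167934026, so the retrained adapter has the same size but different contents. A downloaded copy could be checked against the new pointer roughly as follows. This is a sketch using only the Python standard library. The helper names are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into a dict of its space-separated key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def lfs_fields_for_file(path, chunk_size=1 << 20):
    """Compute the oid and size that a pointer for this file should contain."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large weight files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return {"oid": "sha256:" + digest.hexdigest(), "size": str(os.path.getsize(path))}


def matches_pointer(path, pointer_text):
    """Return True when the file's hash and size both agree with the pointer."""
    pointer = parse_lfs_pointer(pointer_text)
    actual = lfs_fields_for_file(path)
    return pointer.get("oid") == actual["oid"] and pointer.get("size") == actual["size"]


# Pointer contents after this commit, copied from the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e0aa059fd1c09734db9ab6011a808407433369ec1f56ce9ddde4dcc475dc9ad1
size 167934026
"""
```

With the file saved locally, for example as `adapter_model.bin` in the working directory, `matches_pointer("adapter_model.bin", NEW_POINTER)` would report whether the download corresponds to this commit.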