End of training

Files changed (4)

README.md +4 -20
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1

README.md CHANGED

@@ -4,18 +4,18 @@ base_model: meta-llama/Llama-2-7b-hf
 tags:
 - generated_from_trainer
 model-index:
-- name: sparse_llama_7b_hf2_refined_web_50p_2024-05-11
+- name: sparse_llama_7b_hf2_refined_web_50p_2024-05-12
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# sparse_llama_7b_hf2_refined_web_50p_2024-05-11
+# sparse_llama_7b_hf2_refined_web_50p_2024-05-12
 This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2840
+- Loss: 2.3152
 ## Model description
@@ -43,26 +43,10 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- training_steps: 350
+- training_steps: 10
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.201         | 0.0   | 25   | 2.2172          |
-| 2.2379        | 0.0   | 50   | 2.2154          |
-| 2.1411        | 0.01  | 75   | 2.2137          |
-| 2.1523        | 0.01  | 100  | 2.2125          |
-| 2.5823        | 0.01  | 125  | 2.2103          |
-| 2.2672        | 0.01  | 150  | 2.2063          |
-| 2.3044        | 0.01  | 175  | 2.2036          |
-| 2.2119        | 0.02  | 200  | 2.2012          |
-| 2.1888        | 0.02  | 225  | 2.2004          |
-| 2.1592        | 0.02  | 250  | 2.1981          |
-| 2.2455        | 0.02  | 275  | 2.1972          |
-| 2.0666        | 0.02  | 300  | 2.1972          |
-| 2.322         | 0.03  | 325  | 2.1967          |
-| 2.2689        | 0.03  | 350  | 2.1946          |
 ### Framework versions

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1a4fd2a79d8ffbe808d2c2ea4d0bd40d42d0124b1ee869536fe03b85f39d8009
+oid sha256:c298abed023a26104e5746bd2c50b57ba3700f70b77ea4956a5c9fe5c99ec1ef
 size 4938985352

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3ad5d08f9d9201c4efc073088ffaf7916d335b674dee7649ef51aa8d57046323
+oid sha256:dcb6b3532206e4fa19bff1a8b7b7359158fd4c73244833e28f216cda50508526
 size 4947390880

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:34edc553f00682dc58972738c70149fbaef17576b1f28e9fad845e55f2857b71
+oid sha256:0d8fb8d6984a0a5ca91ca2071cdcc88092dc763034c47bd2c111f7ee5ac706cd
 size 3590488816