thrunlab/Mistral_Sparse_refined_web_50p_graceful_True

Text Generation

Generated from Trainer

lukeleeai committed on Mar 10, 2024

Commit cd0152a (1 parent: 8e64e99)

End of training

Files changed (3)

README.md +1 -1
config.json +4 -1
model.safetensors +1 -1

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 10.3563
+- Loss: 10.3587
 ## Model description

config.json CHANGED

@@ -21,7 +21,10 @@
   "rms_norm_eps": 1e-06,
   "rope_theta": 10000.0,
   "sliding_window": 4096,
-  "thresholds": null,
+  "thresholds": [
+    0.0,
+    0.0
+  ],
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
   "transformers_version": "4.37.2",

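The config.json hunk changes `thresholds` from `null` to a two-element list. As a minimal sketch, a consumer could read the field so that both the old and the new form are handled. The helper name `read_thresholds` is hypothetical, and what the values mean (for example, per-layer sparsity cutoffs) is an assumption that this commit does not state.

```python
import json


def read_thresholds(config_text: str) -> list[float]:
    """Return the `thresholds` list from a config.json string.

    Returns an empty list when the key is missing or null, which is how
    the field looked before this commit.
    """
    config = json.loads(config_text)
    thresholds = config.get("thresholds")
    if thresholds is None:
        return []
    return [float(t) for t in thresholds]


# Trimmed stand-ins for the old and new config.json; only the fields
# needed for the illustration are included.
old_config = '{"sliding_window": 4096, "thresholds": null}'
new_config = '{"sliding_window": 4096, "thresholds": [0.0, 0.0]}'

print(read_thresholds(old_config))
print(read_thresholds(new_config))
```
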
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:34c422ac3e1828ac57ac72875c633bd770e41ffa32cd88451e288f4a2b08550c
+oid sha256:eaf4b5f5128df627b2a98fa0fe2ef9caf0eeeff68559ab8f983b496c9a21bd2f
 size 16567728
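
The model.safetensors entry is a Git LFS pointer file, not the weights themselves: each line is a key, a space, and a value, and `oid` is the SHA-256 of the stored content. The sketch below parses the two pointers from this diff and compares them. The helper name `parse_lfs_pointer` is hypothetical. A changed `oid` with an unchanged `size` is consistent with retrained weights of the same shapes and dtype, though the pointer alone cannot confirm that.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Parse a Git LFS pointer file into a {key: value} mapping."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:34c422ac3e1828ac57ac72875c633bd770e41ffa32cd88451e288f4a2b08550c
size 16567728
"""

new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:eaf4b5f5128df627b2a98fa0fe2ef9caf0eeeff68559ab8f983b496c9a21bd2f
size 16567728
"""

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)

# The content hash differs while the byte size is identical.
weights_changed = old["oid"] != new["oid"]
same_size = old["size"] == new["size"]
print(f"weights changed: {weights_changed}, same size: {same_size}")
```
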