iandennismiller committed
Commit: f87e3a7
Parent(s): 207bad5

3-bit quant
Files changed:
- LLama-2-MedText-13b-Q3_K_L.gguf +3 -0
- LLama-2-MedText-13b-Q6_K.gguf +3 -0
- README.md +1 -0
LLama-2-MedText-13b-Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:93371f4e513bdffdc3a7cc164068f9b18ce3287326af67675def93abeab8e2fb
+size 6929559424
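The new weight files land in the repository as Git LFS pointers, so only the object hash and byte count are versioned here; the binary itself is fetched on checkout. After downloading, the pointer values double as a quick integrity check. A minimal sketch, assuming a GNU userland (`sha256sum`, `stat -c`); macOS would use `shasum -a 256` and `stat -f %z` instead:

```bash
# Compare a downloaded GGUF against the digest and size recorded in its LFS pointer.
FILE=LLama-2-MedText-13b-Q3_K_L.gguf

echo "93371f4e513bdffdc3a7cc164068f9b18ce3287326af67675def93abeab8e2fb  $FILE" | sha256sum -c -
[ "$(stat -c %s "$FILE")" -eq 6929559424 ] && echo "size OK: 6929559424 bytes"
```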
LLama-2-MedText-13b-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34602cc7158d458c82cec54c838e9bdf7691b95ce9256657b6a493b07886f91b
+size 10679140224
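Because both quants are LFS objects, a plain `git clone` will try to download all of them. To fetch only one file, the LFS include filter can be used; a minimal sketch, assuming `git-lfs` is installed and the repository URL is filled in for the placeholder:

```bash
# Clone without downloading LFS objects, list what is LFS-tracked, then pull only the 6-bit quant.
GIT_LFS_SKIP_SMUDGE=1 git clone <repo-url> medtext-13b && cd medtext-13b
git lfs ls-files
git lfs pull --include="LLama-2-MedText-13b-Q6_K.gguf"
```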
README.md CHANGED
@@ -85,6 +85,7 @@ Then quantize f32 GGUF to lower bit resolutions
 
 ```bash
 llama.cpp/build/bin/quantize LLama-2-MedText-13b-f32.gguf LLama-2-MedText-13b-Q3_K_L.gguf Q3_K_L
+llama.cpp/build/bin/quantize LLama-2-MedText-13b-f32.gguf LLama-2-MedText-13b-Q6_K.gguf Q6_K
 ```
 
 ### Distributing model through huggingface
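The README addition records the second quantize invocation, so both the Q3_K_L and Q6_K outputs come from the same f32 GGUF. A quick way to confirm each quant loads and generates is to run it through the same llama.cpp build; a minimal sketch, noting that the binary name is an assumption (older builds ship `main`, newer ones `llama-cli`):

```bash
# Smoke-test each quant: load the model and generate a short completion.
for Q in Q3_K_L Q6_K; do
    llama.cpp/build/bin/main -m "LLama-2-MedText-13b-$Q.gguf" -p "The patient presents with" -n 64
done
```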