Update README.md
PROUDLY PRESENTS
```

# Mistral-Small-NovusKyver-iMat-GGUF

> [!TIP]
> <b>Quantization Note:</b> For smaller sizes (i.e. IQ3 and below) a repetition penalty of 1.05-1.15 is recommended.

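As a minimal sketch of applying that tip with llama.cpp's `llama-cli` (the `.gguf` filename and prompt below are illustrative assumptions, not files shipped in this repo):

```shell
# Run a small quant with the recommended repetition penalty.
# NOTE: the model filename is a placeholder; substitute your downloaded quant.
llama-cli \
  -m Mistral-Small-NovusKyver-IQ3_M.gguf \
  -p "Write a short story about a lighthouse." \
  -n 256 \
  --repeat-penalty 1.1
```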
Quantized with love from fp32.

Original model author: [envoid](https://huggingface.co/Envoid/)

* Importance Matrix calculated using [groups_merged.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
* 105 chunks
* n_ctx=512
* Calculation uses fp32 precision model weights
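The settings above correspond roughly to an invocation of llama.cpp's `llama-imatrix` tool like the following (the model and output filenames are illustrative assumptions):

```shell
# Compute the importance matrix from the fp32 weights over the
# groups_merged.txt calibration data, 105 chunks of 512 tokens each.
# NOTE: file names are placeholders.
llama-imatrix \
  -m Mistral-Small-NovusKyver-fp32.gguf \
  -f groups_merged.txt \
  -c 512 \
  --chunks 105 \
  -o imatrix.dat
```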

Original model README [here](https://huggingface.co/Envoid/Mistral-Small-NovusKyver/) and below: