Nexesenex/MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF

Nexesenex committed on Jun 4, 2024

Commit 5305ed3 · verified · 1 Parent(s): 75b9f10

Update README.md

Files changed (1)

README.md +2 -0

@@ -34,6 +34,8 @@ Full offload possible on 16GB VRAM with a decent context size.
 Bonus : a Kobold.CPP Frankenstein which reads IQ3_XXS models and is not affected by the Kobold.CPP 1.56/1.57 slowdown at the cost of an absent Mixtral fix.
 https://github.com/Nexesenex/kobold.cpp/releases/tag/v1.57_b2030
+Now superseded by another KCPP-F, with 13 different KV cache quantization levels to choose from :
+https://github.com/Nexesenex/kobold.cpp/releases
 ---