This is a quantized GGUF of mistralai/Mistral-Nemo-Instruct-2407. Requires llama.cpp newer than commit 50e0535 (7/22/2024) to run inference.

Currently, we just have a Q5_K quantization which comes in at 8.73 GB. If you're interested other quantizations, just ping me @iamlemec on Twitter.

Downloads last month
12
GGUF
Model size
12.2B params
Architecture
llama
Hardware compatibility
Log In to view the estimation
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support