8-bit quantization
#20 by ramkumarkoppu - opened
Hi @Unsloth team, thanks for the great work. Can you please provide instructions for quantizing to 8 bits locally on my Linux system, with the model weights downloaded from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main, so that I can reproduce the quantized model files in the DeepSeek-R1-Q8_0 directory?
The R1 model is already 8-bit by default :)
Now I am more confused: the model weights in the repo https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main tell me otherwise.
The large matrices are fp8, specifically F8_e4m3.
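For context, F8_e4m3 packs each value into a single byte: 1 sign bit, 4 exponent bits (bias 7), and 3 mantissa bits, with a maximum magnitude of 448. A minimal decoder sketch (my own illustration of the FN variant used in ML checkpoints, not code from any DeepSeek or Unsloth repo):

```python
def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 (FN variant) byte into a Python float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF   # 4 exponent bits, bias 7
    mant = byte & 0x7         # 3 mantissa bits
    if exp == 0xF and mant == 0x7:
        return float("nan")   # E4M3FN reserves only this bit pattern for NaN
    if exp == 0:
        return sign * (mant / 8) * 2.0 ** -6          # subnormal
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)   # normal

print(decode_e4m3(0x38))  # 1.0
print(decode_e4m3(0x7E))  # 448.0, the largest representable magnitude
```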
So what did the @unsloth team do to create https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-Q8_0 from https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main? What are the reproduction steps?
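While waiting for an official answer: the usual way to produce a Q8_0 GGUF is with llama.cpp's converter and quantize tool. The sketch below is my guess at the general workflow, not confirmed as Unsloth's exact steps. One wrinkle: llama.cpp's HF converter expects bf16/fp16 safetensors, so the fp8 (F8_e4m3) checkpoint has to be upcast first; DeepSeek publish an `fp8_cast_bf16.py` helper in their DeepSeek-V3 repo for that (the flag names below are taken from its README and may change).

```shell
# Sketch only: paths are assumptions, adjust to your local setup.
# 1. Get llama.cpp and build the quantize tool.
git clone https://github.com/ggerganov/llama.cpp
cmake -B llama.cpp/build llama.cpp
cmake --build llama.cpp/build --target llama-quantize

# 2. Upcast the fp8 (F8_e4m3) checkpoint to bf16, since the GGUF
#    converter does not read fp8 safetensors directly.
python fp8_cast_bf16.py \
    --input-fp8-hf-path DeepSeek-R1 \
    --output-bf16-hf-path DeepSeek-R1-bf16

# 3. Convert the bf16 checkpoint to a single GGUF file.
python llama.cpp/convert_hf_to_gguf.py DeepSeek-R1-bf16 \
    --outfile DeepSeek-R1-BF16.gguf --outtype bf16

# 4. Quantize the GGUF to Q8_0.
llama.cpp/build/bin/llama-quantize DeepSeek-R1-BF16.gguf DeepSeek-R1-Q8_0.gguf Q8_0
```

Note that the intermediate bf16 file for a model of this size is well over a terabyte, so plan disk space accordingly.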