How to make Q4_K_M with your GGUF converter?

#1
by PixelPlayer - opened

I load the full 24 GB Flux safetensors model into GGUF Convertor (Alpha) using ComfyUI_GGUF_windows_portable 0.0.10 (latest). At the output I get a bf16.gguf that is also 24 GB, and that's all. How do I get the desired quantization, for example Q4_K_M? There are no settings in the node itself; maybe it needs to be set somewhere else?

The convertor merely passes the torch tensors through to form a GGUF file. To quantize it further (e.g., from Q8 down to Q2), you need to compile the C/C++ quantization code from llama.cpp into an executable tool and run that separately, since these are customized models that don't follow the HF standard and aren't on gg's supported list yet. We think we can make this a simple step and integrate it into our tool very soon. By the way, you should be able to find the new GGUF node in ComfyUI-Manager right away; the new version needs no extra dependencies, and you can use both nodes at the same time.
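Until that is integrated, the manual route described above looks roughly like this. This is a hedged sketch of the standard llama.cpp workflow, not the tool's exact instructions: the file names are placeholders, the patch step depends on what your converter ships (stock llama.cpp does not know image-model architectures like Flux), and build commands can vary by llama.cpp version.

```shell
# Sketch only: build llama.cpp's quantizer and run it on the converter's
# bf16 output. File names below (flux-bf16.gguf, etc.) are hypothetical.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# If your GGUF tool ships a patch adding image-model architectures,
# apply it here first (path and name depend on the tool):
# git apply ../path/to/llama.cpp.patch

cmake -B build
cmake --build build --config Release --target llama-quantize

# Quantize the converter's bf16 GGUF down to Q4_K_M:
./build/bin/llama-quantize flux-bf16.gguf flux-Q4_K_M.gguf Q4_K_M
```

The last argument selects the quantization scheme; llama-quantize prints the full list of supported types when run with no arguments.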

OK, done; you can grab the cutter here to make your own Q4_K_M GGUF; enjoy.
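If you want to sanity-check that whatever file you end up with really is a GGUF file, the fixed-size header can be read with stdlib Python. This is a minimal sketch based on the public GGUF layout (magic bytes `GGUF`, then little-endian version, tensor count, and metadata KV count); it is independent of any particular converter, and the counts below are made-up example values.

```python
import struct

def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version,
            "tensor_count": n_tensors,
            "metadata_kv_count": n_kv}

# Build a minimal header in memory (hypothetical counts) and parse it back;
# on a real file you would read the first 24 bytes instead.
header = struct.pack("<4sIQQ", b"GGUF", 3, 780, 12)
print(parse_gguf_header(header))
```

On a real model you would replace the packed bytes with `open("model.gguf", "rb").read(24)`.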

Thank you for that! I'll try it.
