Please make i1 quants of my latest 72b model

#1
by rombodawg - opened

I really appreciate your work. I just released my best model yet. I would love it if you could make some i1 quants of it.

https://huggingface.co/Rombo-Org/Rombo-LLM-V3.0-Qwen-72b

Sure! It's queued; it would be a shame if we didn't have quants for it :)

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Rombo-LLM-V3.0-Qwen-72b-GGUF for quants to appear.

mradermacher changed discussion status to closed

@rombodawg hi! I would like to know why we would use i1 quants at all. The general consensus, I think, is that Q4 is pretty good, and anything below it is kinda not worth it anymore, since just picking a smaller model at Q4 makes more sense performance-wise.

I have heard of those 1.58-bit quants, which apparently perform surprisingly well, but I'm assuming those are not the same...

would you mind explaining it to me? I am very curious :)

i1-IQ4_XS, for example, is better quality than Q4_K_M at around 20% smaller size. And i1-IQ1_S quants are ~1.58 bpw and perform "surprisingly well" (though much worse than Q4_K).
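To get a feel for what those bits-per-weight figures mean in practice, here is a rough back-of-the-envelope sketch of the implied GGUF file sizes for a 72B-parameter model. The bpw values are approximate averages I'm assuming for each quant type, not exact numbers from llama.cpp, and real files also carry some non-weight overhead:

```python
# Rough file-size estimates for a 72B model at different quant types.
# The bpw figures below are assumed approximate averages, not exact
# values produced by llama.cpp for any specific model.
PARAMS = 72e9

approx_bpw = {
    "Q4_K_M": 4.85,  # assumed average bpw
    "IQ4_XS": 4.25,  # assumed average bpw
    "IQ1_S": 1.56,   # the "~1.58 bpw" class, assumed
}

def size_gb(bpw: float, n_params: float = PARAMS) -> float:
    """File size in GB implied by an average bits-per-weight."""
    return n_params * bpw / 8 / 1e9

for name, bpw in approx_bpw.items():
    print(f"{name}: ~{size_gb(bpw):.1f} GB")
```

This is only weight storage; the point is the relative scale, e.g. that an IQ1_S file is roughly a third the size of a Q4_K_M one.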