Please make i1 quants of my latest 72b model

#1
by rombodawg - opened

I really appreciate your work. I just released my best model yet. I would love it if you could make some i1 quants of it.

https://huggingface.co/Rombo-Org/Rombo-LLM-V3.0-Qwen-72b

Sure! It's queued; it would be a shame if we didn't have quants for it :)

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Rombo-LLM-V3.0-Qwen-72b-GGUF for quants to appear.

mradermacher changed discussion status to closed

@rombodawg hi! I would like to know why we would use i1 quants at all. The general consensus, I think, is that Q4 is pretty good, and anything below it is kinda not worth it anymore, since just picking a smaller model at Q4 makes more sense performance-wise.

I have heard of those 1.58-bit quants, which apparently perform surprisingly well, but I'm assuming those are not the same...

would you mind explaining it to me? I am very curious :)

i1-IQ4_XS, for example, is better quality than Q4_K_M at around 20% smaller size. And i1-IQ1_S quants are ~1.58 bpw and perform "surprisingly well" (though much worse than Q4_K).
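To get a feel for what those bits-per-weight figures mean in practice, here is a rough back-of-the-envelope sketch of the implied GGUF file sizes for a 72B-parameter model. The bpw values are approximate averages I'm assuming for each quant type, not exact numbers from llama.cpp, and real files also carry some non-weight overhead:

```python
# Rough file-size estimates for a 72B model at different quant types.
# The bpw figures below are assumed approximate averages, not exact
# values produced by llama.cpp for any specific model.
PARAMS = 72e9

approx_bpw = {
    "Q4_K_M": 4.85,  # assumed average bpw
    "IQ4_XS": 4.25,  # assumed average bpw
    "IQ1_S": 1.56,   # the "~1.58 bpw" class, assumed
}

def size_gb(bpw: float, n_params: float = PARAMS) -> float:
    """File size in GB implied by an average bits-per-weight."""
    return n_params * bpw / 8 / 1e9

for name, bpw in approx_bpw.items():
    print(f"{name}: ~{size_gb(bpw):.1f} GB")
```

This is only weight storage; the point is the relative scale, e.g. that an IQ1_S file is roughly a third the size of a Q4_K_M one.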