Quant with imatrix?

#1
by cyberjunk - opened

I've attempted to add an imatrix and quantize the model to q5_K_M. I'm having a few issues with my machine at the moment, so for now I'm just testing a direct conversion to q8_0.
Do you think an imatrix might make up for the quality loss at lower VRAM usage?
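
For a rough sense of the VRAM trade-off in question, here is a minimal sketch that estimates weight storage from the block layouts of a few GGUF formats. The 8B parameter count is a hypothetical placeholder, not this model's actual size, and real files such as q5_K_M mix several formats per tensor, so treat the numbers as approximations. As far as I understand, an imatrix does not change these sizes; it only influences how values are rounded within the same format.

```python
# (bytes per block, weights per block) for a few GGUF tensor formats.
# Mixed presets such as q5_K_M combine several of these per tensor,
# so the results below are rough estimates only.
BLOCK_FORMATS = {
    "F16": (2, 1),        # one fp16 value per weight
    "Q8_0": (34, 32),     # 32 int8 values + one fp16 scale
    "Q5_K": (176, 256),   # 256-weight super-block
}


def bits_per_weight(fmt: str) -> float:
    """Average storage cost of one weight in the given format."""
    block_bytes, block_weights = BLOCK_FORMATS[fmt]
    return block_bytes * 8 / block_weights


def weights_size_gib(n_params: int, fmt: str) -> float:
    """Approximate size of the weights alone, in GiB (no KV cache, no overhead)."""
    return n_params * bits_per_weight(fmt) / 8 / 2**30


if __name__ == "__main__":
    n_params = 8_000_000_000  # hypothetical parameter count, for illustration
    for fmt in BLOCK_FORMATS:
        size = weights_size_gib(n_params, fmt)
        print(f"{fmt}: {bits_per_weight(fmt):.2f} bpw, ~{size:.1f} GiB")
```

This covers weights only; context length and runtime buffers add to actual VRAM use.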

Please get in touch. Love your work!

Hey, sorry to get back to you so late. I haven't really played with imatrix quantization, so you are more of an expert in that department.
=)

I try to prune my models so they keep full precision and still fit under 24 GB.
Thank you for your kind words. I wish I could help; maybe message someone who releases imatrix quants.
I am very interested in how a pruned model performs after being quantized, since I suspect the redundant layers I removed may be part of what lets quantized models still perform OK. Please stay in touch. I am very curious and want to learn more! :D

Reach out to me:
https://www.linkedin.com/in/troyandrewschultz/
