Possible to add Q4_0_4_4 and Q4_0_4_8 quants?

#1
by asdfsdfssddf - opened

Hi, as the title says: I recently discovered that my phone can run LLMs pretty well, and I even managed to get koboldcpp running on it to share text generation with Horde. Would you consider providing the ARM-inference-optimized quants?
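
For context, these ARM-optimized quant types were produced with llama.cpp's `llama-quantize` tool during the window when it still offered them (newer llama.cpp builds dropped them in favor of runtime repacking of Q4_0). A minimal sketch, assuming a local F16 GGUF and a build old enough to list these types; the filenames are illustrative:

```bash
# Hypothetical filenames; requires a llama.cpp build that still
# lists Q4_0_4_4 / Q4_0_4_8 among its quantization types.
./llama-quantize model-f16.gguf model-Q4_0_4_4.gguf Q4_0_4_4
./llama-quantize model-f16.gguf model-Q4_0_4_8.gguf Q4_0_4_8
```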

Is there a reason you couldn't just use the imatrix quants instead? They should be the same size and speed, but much higher quality.
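
For reference, a minimal sketch of the imatrix workflow in llama.cpp, which is what makes those quants higher quality at the same size. The filenames, calibration text, and target quant type here are illustrative assumptions, not the repo's actual settings:

```bash
# Compute an importance matrix from calibration text (filenames are examples).
./llama-imatrix -m model-f16.gguf -f calibration.txt -o model.imatrix

# Quantize with the importance matrix so the most important weights
# are represented more precisely.
./llama-quantize --imatrix model.imatrix model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```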

mradermacher changed discussion status to closed
