Possible to add Q4_0_4_4 and Q4_0_4_8 quants?

#1
by asdfsdfssddf - opened

Hi, as the title says: I recently discovered that my phone can run LLMs pretty well, and I even managed to get koboldcpp running on it to share text generation with Horde. Would you consider providing the ARM-inference-optimized quants?
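
For context, these ARM-optimized quant types were produced with llama.cpp's `llama-quantize` tool during the window when it still offered them (newer llama.cpp builds dropped them in favor of runtime repacking of Q4_0). A minimal sketch, assuming a local F16 GGUF and a build old enough to list these types; the filenames are illustrative:

```bash
# Hypothetical filenames; requires a llama.cpp build that still
# lists Q4_0_4_4 / Q4_0_4_8 among its quantization types.
./llama-quantize model-f16.gguf model-Q4_0_4_4.gguf Q4_0_4_4
./llama-quantize model-f16.gguf model-Q4_0_4_8.gguf Q4_0_4_8
```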

Is there a reason you couldn't just use the imatrix quants instead? They should be the same size and speed, but much higher quality.
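
For reference, a minimal sketch of the imatrix workflow in llama.cpp, which is what makes those quants higher quality at the same size. The filenames, calibration text, and target quant type here are illustrative assumptions, not the repo's actual settings:

```bash
# Compute an importance matrix from calibration text (filenames are examples).
./llama-imatrix -m model-f16.gguf -f calibration.txt -o model.imatrix

# Quantize with the importance matrix so the most important weights
# are represented more precisely.
./llama-quantize --imatrix model.imatrix model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```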

mradermacher changed discussion status to closed
