Request for 4-bit or 5-bit Quantized Version of Aya-Vision-32B

#4 by alexeyd76

Would it be possible to release a 4-bit or 5-bit quantized version of the Aya-Vision-32B model? At 4 bits per weight, the ~32B parameters come to roughly 16 GB, so the model would fit on a 24 GB GPU with headroom left for the vision encoder, KV cache, and activations, making it much more accessible for local inference. A GGUF or GPTQ quantized version would be especially helpful.
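In the meantime, on-the-fly 4-bit loading with bitsandbytes may already get close. A minimal sketch, assuming a recent transformers version with Aya Vision support via AutoModelForImageTextToText, and assuming the repo id is CohereForAI/aya-vision-32b (check the actual model card):

```python
# Sketch: on-the-fly 4-bit quantization at load time with bitsandbytes.
# Assumptions: recent transformers with Aya Vision support, and the
# repo id below -- verify both against the model card.
import torch
from transformers import (
    AutoModelForImageTextToText,
    AutoProcessor,
    BitsAndBytesConfig,
)

model_id = "CohereForAI/aya-vision-32b"  # assumed repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NF4 is the usual 4-bit choice
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,       # shaves a bit more memory
)

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                    # offloads to CPU if VRAM runs short
)
```

This quantizes at load time rather than giving a prequantized artifact, so it still downloads the full-precision weights; a published GGUF/GPTQ release would avoid that and load faster.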
Thanks!
