Request for 4-bit or 5-bit Quantized Version of Aya-Vision-32B
#4 · opened by alexeyd76
Would it be possible to release a 4-bit or 5-bit quantized version of the Aya-Vision-32B model? At 4 bits the weights of a 32B-parameter model come to roughly 16 GB (about 20 GB at 5 bits), so it would fit on a 24 GB GPU, making it much more accessible for local inference. A GGUF or GPTQ quantized version would be especially helpful.
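
In the meantime, a minimal sketch of a stopgap: loading the model with on-the-fly 4-bit (NF4) quantization via bitsandbytes and the standard transformers API. The repo id `CohereForAI/aya-vision-32b` and the use of `AutoModelForImageTextToText` are assumptions (adjust to the actual repo name and whatever auto class the model card specifies); this is not an official recipe.

```python
# Sketch: on-the-fly 4-bit NF4 load as a stopgap until an official
# GGUF/GPTQ release. Model id and auto class are assumptions.
import torch
from transformers import (
    AutoProcessor,
    AutoModelForImageTextToText,
    BitsAndBytesConfig,
)

model_id = "CohereForAI/aya-vision-32b"  # assumed repo id

# NF4 stores ~0.5 byte/param, so a 32B model needs roughly 16 GB for
# weights, leaving headroom for activations and KV cache on a 24 GB GPU.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```

This quantizes at load time, so it avoids waiting for a prebuilt artifact, but a proper GGUF/GPTQ release would still load faster and run under llama.cpp-style runtimes.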
Thanks!