Request for 4-bit or 5-bit Quantized Version of Aya-Vision-32B

#4 by alexeyd76

Would it be possible to release a 4-bit or 5-bit quantized version of the Aya-Vision-32B model? At 4 bits per weight, the ~32B parameters come to roughly 16 GB, so the model would fit on a 24 GB GPU with headroom left for the vision encoder, KV cache, and activations, making it much more accessible for local inference. A GGUF or GPTQ quantized version would be especially helpful.
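In the meantime, on-the-fly 4-bit loading with bitsandbytes may already get close. A minimal sketch, assuming a recent transformers version with Aya Vision support via AutoModelForImageTextToText, and assuming the repo id is CohereForAI/aya-vision-32b (check the actual model card):

```python
# Sketch: on-the-fly 4-bit quantization at load time with bitsandbytes.
# Assumptions: recent transformers with Aya Vision support, and the
# repo id below -- verify both against the model card.
import torch
from transformers import (
    AutoModelForImageTextToText,
    AutoProcessor,
    BitsAndBytesConfig,
)

model_id = "CohereForAI/aya-vision-32b"  # assumed repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NF4 is the usual 4-bit choice
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,       # shaves a bit more memory
)

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                    # offloads to CPU if VRAM runs short
)
```

This quantizes at load time rather than giving a prequantized artifact, so it still downloads the full-precision weights; a published GGUF/GPTQ release would avoid that and load faster.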
Thanks!
