Please Quantize MiniMaxAI/MiniMax-VL-01
Hey everyone working on quantization!
Big thanks for all your work—your contributions to AI optimization are seriously appreciated.
Right now, MiniMaxAI/MiniMax-VL-01 is the best VL model out there, and a quantized version could take it even further. It would make the model more efficient, reduce compute costs, and make it more accessible for everyone.
If possible, it would be great to have a diverse range of quantized versions—optimized for different hardware, precision levels, and use cases. This way, more people can benefit from it, whether they're running it on consumer GPUs, cloud servers, or edge devices.
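For anyone curious what these variants actually involve: at its core, weight quantization just rescales floating-point weights onto a low-bit integer grid. A minimal absmax int8 sketch in plain Python, purely illustrative (real quantizers such as GPTQ or AWQ are far more sophisticated, and per-channel or group-wise scaling is standard in practice):

```python
def quantize_absmax_int8(weights):
    """Map floats onto the signed int8 grid [-127, 127] via absmax scaling."""
    # One scale per tensor; guard against all-zero input.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]
```

Storing `q` as int8 plus one float scale is what shrinks the memory footprint; the price is the rounding error visible after `dequantize`.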
If anyone is up for it, that would be amazing! Huge thanks in advance!
Hi there, thank you. We've heard you, and we'll see what we can do. Unfortunately, we do have to prioritize our time, costs, etc. :)
I understand. After all, according to VL benchmarks, this model is either on par with or worse than Qwen 2.5 VL, despite Qwen having significantly fewer parameters. However, as a Russian speaker who tested handwritten input in Russian, I personally found that MiniMax handled that specific task better than Qwen 2.5 VL did. So perhaps in that particular case, MiniMax might be the better choice.

For these and other reasons, the MiniMax models didn't generate much buzz. So yes, if I were in your place, I wouldn't bother quantizing this model either. But thank you very much for responding to such a mass message, which could easily be classified as spam or a bot. I'm a real person; I was just prepared for at most one company or individual to respond.
Although, in the end, you were the only one who responded to my message. Once again, thank you so much for not ignoring this discussion!