GGUF
I would be grateful if you could explain why there is still no GGUF version of the model. BNB 4-bit doesn't fit on a consumer-grade 24GB GPU, so it would be interesting to see Q3 or IQ3 quants. GGUF versions were made for Qwen2-VL, so are there technical challenges preventing a GGUF conversion for 2.5?
Actually, several llama.cpp forks have already implemented the Qwen2.5-VL architecture, and GGUF files were previously uploaded. However, some users loaded those GGUF files in applications such as upstream llama.cpp, LM Studio, or Ollama, which do not yet support this architecture, and then filed bug reports that became burdensome enough for the fork developers that they took down their GGUF model repositories. If you want to test it, you can find these forked llama.cpp repositories on GitHub and build a GGUF file yourself; a rough sketch of that workflow follows below.
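For reference, here is a minimal sketch of the build-it-yourself route, assuming the fork keeps upstream llama.cpp's CMake layout, its convert_hf_to_gguf.py script, and the llama-quantize tool. The fork URL and model directory below are placeholders; substitute the actual fork you find, and check its README, since vision models may also need the vision encoder converted to a separate mmproj GGUF.

```python
# Sketch: convert a local Qwen2.5-VL checkpoint to GGUF via a llama.cpp
# fork that supports the architecture. Assumes upstream's build layout,
# convert script, and quantize tool are present in the fork.
import subprocess
from pathlib import Path

FORK_URL = "https://github.com/<user>/llama.cpp"  # placeholder fork URL
MODEL_DIR = Path("Qwen2.5-VL-7B-Instruct")        # local HF checkpoint
OUT_F16 = Path("qwen2.5-vl-f16.gguf")
OUT_Q3 = Path("qwen2.5-vl-Q3_K_M.gguf")

def run(cmd, **kwargs):
    # Echo each command, then run it, raising on failure.
    print("+", " ".join(str(c) for c in cmd))
    subprocess.run([str(c) for c in cmd], check=True, **kwargs)

# 1. Clone and build the fork (upstream uses CMake).
run(["git", "clone", FORK_URL, "llama.cpp-fork"])
run(["cmake", "-B", "build"], cwd="llama.cpp-fork")
run(["cmake", "--build", "build", "--config", "Release"], cwd="llama.cpp-fork")

# 2. Convert the HF checkpoint to an f16 GGUF.
run(["python", "llama.cpp-fork/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", OUT_F16, "--outtype", "f16"])

# 3. Quantize down to Q3_K_M so it fits on a 24GB consumer GPU.
run(["llama.cpp-fork/build/bin/llama-quantize", OUT_F16, OUT_Q3, "Q3_K_M"])
```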