Inference:

```shell
./llama-qwen2vl-cli -m Q8_0.gguf --mmproj qwen2vl-vision.gguf -p "Describe this image." --image "demo.jpg"
```

Converted using this Colab Notebook.

Special thanks to:

HimariO, for the excellent work on enabling quantization for Qwen2-VL (PR on GitHub)!

Model details:

- Format: GGUF
- Model size: 1.54B params
- Architecture: qwen2vl
- Quantizations: 4-bit, 8-bit, 16-bit

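As a rough sanity check (a sketch of the arithmetic, not figures from the model card), the approximate file size of each quantization level can be estimated from the 1.54B parameter count; real GGUF files will differ somewhat because of quantization block scales, metadata, and the separate vision projector:

```python
# Rough GGUF file-size estimate from parameter count and bits per weight.
# Assumption (not from the model card): each quant stores roughly that
# many bits per parameter; overheads are ignored.
PARAMS = 1.54e9  # 1.54B parameters

def approx_size_gb(bits_per_weight: float) -> float:
    """Approximate model file size in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_weight / 8 / 1e9

for bits in (4, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_size_gb(bits):.2f} GB")
# → 4-bit ~0.77 GB, 8-bit ~1.54 GB, 16-bit ~3.08 GB
```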

This model (Lyte/Qwen2-VL-2B-Instruct-GGUF) is quantized from the base model Qwen/Qwen2-VL-2B.