Using vllm==0.6.3 to load this model raises the following error

#1
by wc-llm - opened

When I use vllm==0.6.3 to load this model, it raises the following error:
File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/parameter.py", line 133, in load_qkv_weight
assert param_data.shape == loaded_weight.shape
AssertionError

torch.Size([6144, 768]) != torch.Size([4608, 768])
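For context, the failing assertion compares the shape of the fused QKV parameter that vLLM pre-allocates against the shape of the tensor found in the checkpoint; AWQ checkpoints store quantized/packed weights whose shapes differ from the full-precision layout vLLM expects here. A minimal sketch of that check, using only the shapes from the traceback above (the helper function is hypothetical, not vLLM's actual code):

```python
# Illustrative sketch of the shape check that raises in vLLM's
# load_qkv_weight; shapes are taken from the traceback above.
param_shape = (6144, 768)   # shape vLLM allocated for the fused QKV weight
loaded_shape = (4608, 768)  # shape of the tensor found in the AWQ checkpoint

def check_qkv_shapes(param_shape, loaded_shape):
    """Mimic the failing `assert param_data.shape == loaded_weight.shape`."""
    if param_shape != loaded_shape:
        raise AssertionError(f"{param_shape} != {loaded_shape}")

# Matching shapes pass silently; mismatched shapes raise, as seen above.
check_qkv_shapes((6144, 768), (6144, 768))  # OK
```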

OpenGVLab org

Thank you for your question! Currently, we only support loading AWQ models with lmdeploy. We appreciate your feedback and will consider expanding support in the future.
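For reference, loading an AWQ checkpoint with lmdeploy typically uses its `pipeline` API with the TurboMind backend set to AWQ format. A minimal sketch, assuming a model path (the repo ID below is a placeholder, not necessarily this model's ID), and requiring a GPU plus the model weights to actually run:

```python
from lmdeploy import pipeline, TurbomindEngineConfig

# Placeholder model ID; substitute the AWQ repo you are loading.
model_id = "OpenGVLab/your-model-AWQ"

# model_format="awq" tells the TurboMind backend to expect AWQ-quantized weights.
pipe = pipeline(model_id, backend_config=TurbomindEngineConfig(model_format="awq"))

response = pipe("Describe this model in one sentence.")
print(response)
```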

czczup changed discussion status to closed
