Bug when using function calling with vllm==0.8.4

#4
by waple - opened

I simply ported the function calling demo from Transformers to vLLM and got:

```
RuntimeError: Failed running call_function (*((FakeTensor(..., device='cuda:0', size=(s0, 4096), dtype=torch.bfloat16), FakeTensor(..., device='cuda:0', size=(s0, 4096), dtype=torch.bfloat16)), Parameter(FakeTensor(..., device='cuda:0', size=(13696, 4096), dtype=torch.bfloat16)), None), **{}):
```

When I downgraded vllm to 0.8.3, it ran successfully.
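
For context, here is a minimal sketch of the kind of vLLM function-calling request that hits this; the model id (`THUDM/GLM-4-9B-0414`) and the `get_weather` tool schema are illustrative stand-ins, since the original demo code isn't shown:

```python
# Minimal repro sketch (assumptions: the THUDM/GLM-4-9B-0414 checkpoint and an
# illustrative get_weather tool; the original demo code is not shown above).
from vllm import LLM, SamplingParams

# A toy tool schema in the OpenAI function-calling format that vLLM's chat API accepts.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Beijing?"}]

llm = LLM(model="THUDM/GLM-4-9B-0414")
# On vllm==0.8.4 this raises the RuntimeError above; on 0.8.3 it succeeds.
out = llm.chat(messages, SamplingParams(max_tokens=256), tools=tools)
print(out[0].outputs[0].text)
```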

Z.ai & THUKEG org

This is fixed by https://github.com/vllm-project/vllm/pull/16618 ("Fix the inference problem of GLM-4-0414").
