Bug when using function calling with vllm==0.8.4
#4, opened by waple
I ported the function call demo from Transformers to vLLM without other changes and got the following error:
RuntimeError: Failed running call_function (*((FakeTensor(..., device='cuda:0', size=(s0, 4096), dtype=torch.bfloat16), FakeTensor(..., device='cuda:0', size=(s0, 4096), dtype=torch.bfloat16)), Parameter(FakeTensor(..., device='cuda:0', size=(13696, 4096), dtype=torch.bfloat16)), None), **{}):
After I downgraded vllm to 0.8.3, the same code ran successfully.
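For anyone hitting the same error, here is a minimal sketch of a check that tells you whether the installed vllm is the version from this report. The helper name `check_vllm_version` is hypothetical and not part of vLLM. The only facts it encodes are the two versions mentioned above (0.8.4 failed, 0.8.3 worked), and it compares version strings exactly, so other versions are simply reported as untested here.

```python
from importlib import metadata
from typing import Optional

KNOWN_BAD = "0.8.4"   # version that raised the RuntimeError in this report
KNOWN_GOOD = "0.8.3"  # version that ran successfully in this report


def check_vllm_version(installed: Optional[str] = None) -> str:
    """Return a short status message for the given (or installed) vllm version.

    Hypothetical helper for illustration only; it does an exact string match.
    """
    if installed is None:
        try:
            installed = metadata.version("vllm")
        except metadata.PackageNotFoundError:
            return "vllm is not installed"
    if installed == KNOWN_BAD:
        return f"vllm {installed}: affected, try `pip install vllm=={KNOWN_GOOD}`"
    return f"vllm {installed}: not the version from this report"


if __name__ == "__main__":
    print(check_vllm_version())
```

If the check reports the affected version, pinning with `pip install vllm==0.8.3` is the workaround that worked for me.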