May be a bug

#24
by DSY001 - opened

https://huggingface.co/Qwen/Qwen-VL-Chat/blob/f57cfbd358cb56b710d963669ad1bcfb44cdcdd8/modeling_qwen.py#L616

I think that should be:

past_length = past_key_values[0][0].size(-3)

because the shape of past_key_values[0][0] is [bz, seq_len, num_heads, head_dim], so the sequence length sits at dimension -3 (equivalently, dimension 1).
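
To make the indexing concrete, here is a minimal sketch of which dimension each negative index picks for a 4-D cached key. The sizes are made up, the name num_heads for the third dimension is my assumption, and a plain tuple stands in for the tensor shape so nothing needs to be installed. Index -2 is shown only for comparison.

```python
# Hypothetical sizes, chosen only so that every dimension is distinct.
batch_size, seq_len, num_heads, head_dim = 2, 7, 4, 16

# Shape of one cached key tensor, using the 4-D layout described in this post.
# A plain tuple stands in for tensor.shape so the sketch has no dependencies.
cached_key_shape = (batch_size, seq_len, num_heads, head_dim)

# For a PyTorch tensor t, t.size(i) returns the same value as t.shape[i],
# so indexing the tuple mirrors what .size(-2) and .size(-3) would return.
length_from_minus_2 = cached_key_shape[-2]  # third dimension, not the sequence
length_from_minus_3 = cached_key_shape[-3]  # second dimension, the sequence

print(length_from_minus_2, length_from_minus_3)
```

With this layout only index -3 (or 1) gives the number of cached tokens, so any position offset derived from another index would be wrong whenever the two sizes differ.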
