Missing attention_mask on hook

#44
by riedgar-ms - opened

I'm attempting to use Phi-4 with the attention-steering approach of PASTA. However, I'm running into trouble because the hook set on the self-attention layer is not being passed an attention_mask: the argument is present when the hooked function is called, but it is set to None. Is this expected? The same hook works fine on Phi-3.
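For reference, a minimal sketch of what I mean by "the argument is set to None" (this is not the PASTA code itself; `ToyAttention` below is a hypothetical stand-in for the model's self-attention layer, which in the real case would come from loading Phi-4 with transformers):

```python
import torch
import torch.nn as nn

captured = {}

def attn_hook(module, args, kwargs, output):
    # Record whatever attention_mask the layer actually received,
    # so we can check it after a forward pass.
    captured["attention_mask"] = kwargs.get("attention_mask")
    return output

# Hypothetical stand-in for a decoder self-attention layer.
class ToyAttention(nn.Module):
    def forward(self, hidden_states, attention_mask=None):
        return hidden_states

layer = ToyAttention()
# with_kwargs=True (PyTorch >= 2.0) lets the hook see keyword
# arguments such as attention_mask, not just positional args.
layer.register_forward_hook(attn_hook, with_kwargs=True)

x = torch.ones(1, 4, 8)
# Mirrors the behaviour I see with Phi-4: the keyword is passed,
# but its value is None.
layer(x, attention_mask=None)
print(captured["attention_mask"] is None)  # → True
```

With Phi-3, the same hook receives an actual mask tensor at this point; with Phi-4, `captured["attention_mask"]` is None.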
