NaN values in the hidden_states
This checkpoint produces NaN values in the hidden_states. It also sometimes gets stuck during inference.
Hello, thank you for your feedback. Could you please provide a simple example so that I can reproduce the issue? I haven't encountered this model outputting NaN hidden states.
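A minimal repro along these lines would be ideal: load the checkpoint, run one forward pass with output_hidden_states=True, and check each layer for NaNs. This is only a sketch, and the checkpoint id below is a placeholder, so please substitute the one you are actually loading:

```python
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "path/to/this-checkpoint"  # placeholder: use the checkpoint you are reporting against
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)  # add trust_remote_code=True if the checkpoint ships custom modeling code
model.eval()

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Report which layers, if any, contain NaN values
for i, h in enumerate(outputs.hidden_states):
    if torch.isnan(h).any():
        print(f"NaN found in hidden_states[{i}]")
```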
I keep getting this error:

```
line 1119, in forward
    position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 0]' is invalid for input of size 846
```
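For what it's worth, the '[-1, 0]' in the message means seq_length is 0 when forward reaches that line, so the reshape fails regardless of how many position ids there are. The same error can be reproduced in isolation (a minimal sketch, not the actual model code):

```python
import torch

position_ids = torch.arange(846)   # same number of elements as in the error above
seq_length = 0                     # what the '[-1, 0]' shape implies at that point
position_ids.view(-1, seq_length)  # RuntimeError: shape '[-1, 0]' is invalid for input of size 846
```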
Thanks for the feedback. This problem should have been fixed yesterday; please try again.
This error is likely due to differences in transformers versions. Please try transformers==4.47.1.
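If pinning the version does not help, it is worth confirming that the environment running inference actually imports the pinned release, for example:

```python
import transformers
print(transformers.__version__)  # expected: 4.47.1 after `pip install transformers==4.47.1`
```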