fix: modeling_deepseek.py should use `deepseek` instead of `deepseek_v2` architecture

#1

I have copied the file from https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat/edit/main/modeling_deepseek.py

I believe this is the correct file, since the keys in the model's weight dict match it (it uses the original self_attn architecture).
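
For anyone who wants to repeat the check, here is a minimal sketch of the key comparison described above. It is not the exact procedure used for this PR: `diff_keys` is a hypothetical helper, and the key names are illustrative assumptions rather than values read from the actual checkpoint.

```python
def diff_keys(checkpoint_keys, model_keys):
    """Compare two collections of parameter names.

    Returns (missing, unexpected):
      missing    - names the model defines but the checkpoint lacks
      unexpected - names the checkpoint has but the model does not define
    """
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative names only (assumed, not copied from the real weight dict).
prefix = "model.layers.0.self_attn."
checkpoint = [prefix + n + ".weight" for n in ("q_proj", "k_proj", "v_proj", "o_proj")]

# A modeling file whose attention block uses the same projection names.
matching_model = list(checkpoint)

# A modeling file whose attention block uses different projection names.
mismatched_model = [
    prefix + n + ".weight"
    for n in ("q_proj", "kv_a_proj_with_mqa", "kv_b_proj", "o_proj")
]

print(diff_keys(checkpoint, matching_model))
print(diff_keys(checkpoint, mismatched_model))
```

If both lists come back empty, the modeling file and the weights agree on parameter names. With real weights, the two inputs would be the keys of the loaded checkpoint and of the instantiated model's state dict.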

Fixed, thanks!

llllvvuu changed pull request status to closed
