Unexpected `inv_freq` buffers in the checkpoint
#6 · opened by sadra-barikbin
Hi,
As of transformers==4.32.0 (the latest is 4.33.2), the `inv_freq` buffers of the rotary embeddings are no longer part of the model's state dict, which produces an error when loading older checkpoints.
To reproduce:
import json
import torch
from transformers import LlamaForCausalLM, LlamaConfig

config_dict = json.load(open("config.json"))
config = LlamaConfig(**config_dict)
model = LlamaForCausalLM(config)
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict)  # raises the error below
Output:
Error(s) in loading state_dict for LlamaForCausalLM:
Unexpected key(s) in state_dict: "model.layers.0.self_attn.rotary_emb.inv_freq", "model.layers.1.self_attn.rotary_emb.inv_freq", "model.layers.2.self_attn.rotary_emb.inv_freq", "model.layers.3.self_attn.rotary_emb.inv_freq", ...
I'm facing the same issue.
Just deleting the keys that include `inv_freq` from the state dict solves this issue; `inv_freq` is just a constant.
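A minimal sketch of that workaround, reusing the `state_dict` and `model` from the reproduction snippet above (passing `strict=False` to `load_state_dict` would also ignore the extra keys):

# Drop the obsolete inv_freq buffers; they are recomputed from the config
# when the model is initialized, so nothing is lost.
state_dict = {k: v for k, v in state_dict.items() if not k.endswith("rotary_emb.inv_freq")}
model.load_state_dict(state_dict)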
https://github.com/huggingface/transformers/pull/24998