
Unexpected `inv_freq` buffers in the checkpoint

#6
by sadra-barikbin - opened

Hi,
As of transformers==4.32.0 (the latest is 4.33.2), the inv_freq buffers of the rotary embeddings are no longer part of the model's state dict, so loading older checkpoints that still contain them produces an error.

To reproduce:

import json

import torch
from transformers import LlamaForCausalLM, LlamaConfig

config_dict = json.load(open("config.json"))
config = LlamaConfig(**config_dict)

model = LlamaForCausalLM(config)
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state_dict)

Output:

Error(s) in loading state_dict for LlamaForCausalLM:
    Unexpected key(s) in state_dict: "model.layers.0.self_attn.rotary_emb.inv_freq", "model.layers.1.self_attn.rotary_emb.inv_freq", "model.layers.2.self_attn.rotary_emb.inv_freq", "model.layers.3.self_attn.rotary_emb.inv_freq", ...

I'm hitting the same issue.

Just deleting the state-dict keys that include inv_freq solves this issue; inv_freq is a constant that the model recomputes at initialization anyway.
https://github.com/huggingface/transformers/pull/24998
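The deletion described above can be sketched as a small helper (strip_inv_freq is a hypothetical name, not part of transformers; it filters the checkpoint before calling load_state_dict):

```python
def strip_inv_freq(state_dict):
    """Drop the rotary-embedding inv_freq buffers that transformers >= 4.32.0
    no longer expects in the checkpoint; the model recomputes them at init."""
    return {k: v for k, v in state_dict.items()
            if not k.endswith("rotary_emb.inv_freq")}
```

Usage would be model.load_state_dict(strip_inv_freq(state_dict)). Alternatively, model.load_state_dict(state_dict, strict=False) also ignores the unexpected keys, but it silently ignores any other mismatched keys as well, so the explicit filter is safer.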
