Failed to convert weights into HuggingFace format
#11 · opened by hrezaei
I managed to convert the 7B_200B_1 checkpoint into the transformers format, but for 7B_200B_4 an error like this occurs: KeyError: 'layers.29.attention.wq.weight'
The command I ran is:
python /src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir ~/multi-token-prediction/7B_200B_4 --model_size 7B --output_dir ~/llama-multi-token/7B_200B_4
And this is the stack trace:
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Saving a LlamaTokenizerFast to ~/llama-multi-token/7B_200B_4.
1 32 32 4096
Fetching all parameters from the checkpoint at ~/multi-token-prediction/7B_200B_4.
Traceback (most recent call last):
  File "src/transformers/models/llama/convert_llama_weights_to_hf.py", line 415, in <module>
    main()
  File "src/transformers/models/llama/convert_llama_weights_to_hf.py", line 403, in main
    write_model(
  File "src/transformers/models/llama/convert_llama_weights_to_hf.py", line 175, in write_model
    loaded[f"layers.{layer_i}.attention.wq.weight"], n_heads=n_heads
KeyError: 'layers.29.attention.wq.weight'
Any workaround would be appreciated.
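In case it helps with debugging, here is a quick sketch for listing which attention weights the 7B_200B_4 checkpoint actually contains. It assumes the usual Meta-style consolidated.*.pth shard layout, which may not hold for this release; any layer missing from the standard "layers.N.*" naming would explain the KeyError above.

import glob
import torch

# Print every attention query weight key so the layer numbering is easy to scan.
for shard in sorted(glob.glob("7B_200B_4/consolidated.*.pth")):
    state_dict = torch.load(shard, map_location="cpu")
    for key in sorted(k for k in state_dict if k.endswith("attention.wq.weight")):
        print(shard, key, tuple(state_dict[key].shape))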