Converting Qwen-7B-QAnything to GGUF failed with KeyError: 'max_position_embeddings'
Hello,
I'm encountering an issue while converting the netease-youdao/Qwen-7B-QAnything model to GGUF format using the Hugging Face Space for automatic model conversion. Detailed information is as follows:

- Model: netease-youdao/Qwen-7B-QAnything
- Quantization methods attempted: Q4_K_M, Q8_0 (both resulted in the same error)
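For reference, the fp16 conversion step that fails in the Space can be reproduced locally along these lines (a minimal sketch, assuming a local clone of llama.cpp and of the model; the paths, script name, and flags may differ between llama.cpp versions):

```python
import subprocess

# Run llama.cpp's HF -> GGUF converter on a local clone of the model.
# Paths are placeholders; adjust to where llama.cpp and the model are checked out.
subprocess.run(
    [
        "python", "llama.cpp/convert-hf-to-gguf.py",
        "Qwen-7B-QAnything",                        # local clone of netease-youdao/Qwen-7B-QAnything
        "--outfile", "qwen-7b-qanything-f16.gguf",
        "--outtype", "f16",
    ],
    check=True,
)
```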
Here is the complete error trace:
Error: Error converting to fp16:
INFO:hf-to-gguf:Loading model: Qwen-7B-QAnything
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
Traceback (most recent call last):
  File "/home/user/app/llama.cpp/convert-hf-to-gguf.py", line 2546, in <module>
    main()
  File "/home/user/app/llama.cpp/convert-hf-to-gguf.py", line 2528, in main
    model_instance.set_gguf_parameters()
  File "/home/user/app/llama.cpp/convert-hf-to-gguf.py", line 1576, in set_gguf_parameters
    self.gguf_writer.add_context_length(self.hparams["max_position_embeddings"])
KeyError: 'max_position_embeddings'
It appears the script expects a `max_position_embeddings` entry in the model's `hparams` (read from the model's config.json), which this model does not provide, so the lookup raises a `KeyError`.
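For what it's worth, one possible local workaround would be to add the missing field to config.json before converting. This is only a sketch, and it assumes the config stores the context length under another key such as `seq_length`, which needs to be verified against this model's actual config.json:

```python
import json
from pathlib import Path

# Path to a local clone of netease-youdao/Qwen-7B-QAnything (placeholder).
config_path = Path("Qwen-7B-QAnything/config.json")
config = json.loads(config_path.read_text())

# convert-hf-to-gguf.py reads "max_position_embeddings" as the context length.
# Some custom Qwen configs expose it under a different key; "seq_length" here
# is an assumption -- check config.json for the field that actually holds it.
if "max_position_embeddings" not in config:
    config["max_position_embeddings"] = config.get("seq_length", 8192)
    config_path.write_text(json.dumps(config, indent=2))
```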
Could you please advise on how to address this issue? Any help or guidance would be greatly appreciated.
Thank you!
Hey @BICENG - can you please try again? We merged some more changes to make the repo more robust.
Furthermore, it appears that the model itself is custom and doesn't follow the same convention as Qwen in llama.cpp.
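If it helps with debugging, a quick check of which fields the converter might be missing could look like this (a rough sketch; the key list is only a guess at what a Qwen-style conversion reads, not the authoritative set from convert-hf-to-gguf.py):

```python
import json
from pathlib import Path

# Local clone of the model (placeholder path).
config = json.loads(Path("Qwen-7B-QAnything/config.json").read_text())

# Fields a Qwen-style GGUF conversion typically needs; this list is an
# assumption -- the definitive set is whatever set_gguf_parameters() reads.
expected = [
    "max_position_embeddings",
    "hidden_size",
    "num_hidden_layers",
    "num_attention_heads",
    "layer_norm_epsilon",
]
print("missing:", [k for k in expected if k not in config])
```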
(Closing this for now - please feel free to open a new issue if it persists.)