llama_load_model_from_file: failed to load model

#1
by youweikun - opened

Does this model require LlamaEdge? I encountered the following error when using the llama_cpp package to load the model:
```
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1080, compute capability 6.1, VMM: yes
  Device 1: NVIDIA GeForce GTX 1080, compute capability 6.1, VMM: yes
llm_load_tensors: ggml ctx size = 0.14 MiB
llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_qkv.weight' has wrong shape; expected 4096, 4608, got 4096, 5120, 1, 1
llama_load_model_from_file: failed to load model
```
