onekq 
posted an update about 12 hours ago
Heard good things about this model, but no inference providers support it ...

THUDM/GLM-4-9B-0414

It works on llama.cpp, though.

Here is how you can run it:

llama-server -ngl 999 --host 192.168.1.68 --override-kv glm4.rope.dimension_count=int:64 --override-kv tokenizer.ggml.eos_token_id=int:151336 -m /mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf
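Once llama-server is up, it exposes an OpenAI-compatible HTTP API. A minimal sketch of building a chat request against it, assuming the host from the command above and llama-server's default port 8080 (the model name field is informational and a placeholder here):

```python
import json

def build_chat_request(prompt, host="192.168.1.68", port=8080):
    # llama-server serves an OpenAI-compatible /v1/chat/completions route;
    # the port is its default and an assumption, as is the model name below.
    url = f"http://{host}:{port}/v1/chat/completions"
    payload = {
        "model": "GLM-4-9B-0414",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return url, json.dumps(payload)

url, body = build_chat_request("Hello, GLM!")
```

You can pass `body` to any HTTP client (curl, requests) with a POST to `url`.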

Read here why:

Eval bug: GLM-Z1-9B-0414 · Issue #12946 · ggml-org/llama.cpp:
https://github.com/ggml-org/llama.cpp/issues/12946#issuecomment-2803564782


Ah, I see. They have their own architecture.

https://github.com/huggingface/transformers/pull/37388

This will be hard.
