---
license: llama3.1
---

# Llama 3.1 405B Quants

- IQ1_S: 86.8 GB
- IQ1_M: 95.1 GB

Quantized from the BF16 GGUF here: https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/

which was converted from Llama 3.1 405B Instruct: https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct

Quantized with llama.cpp version b3459. There is ongoing work in llama.cpp to fully support this model. Some reports say the model works fine if you limit the context to 8192. If not, you can also try changing the RoPE frequency base as described in: https://www.reddit.com/r/LocalLLaMA/comments/1ectacp/until_the_rope_scaling_is_fixed_in_gguf_for/
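As a sketch, the two workarounds above might look like this with llama.cpp's CLI (the model filename and the frequency-base value are placeholders, not values tested with these quants):

```shell
# Workaround 1: limit the context to 8192 tokens
# (model filename is a placeholder for whichever quant you downloaded)
./llama-cli -m Meta-Llama-3.1-405B-Instruct-IQ1_M.gguf \
    -c 8192 \
    -p "Hello"

# Workaround 2: override the RoPE frequency base instead
# (example value only; see the linked thread for discussion)
./llama-cli -m Meta-Llama-3.1-405B-Instruct-IQ1_M.gguf \
    --rope-freq-base 500000 \
    -p "Hello"
```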

imatrix file: https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/blob/main/405imatrix.dat
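If you want to reproduce these quants or build other sizes yourself, the imatrix file can be passed to llama.cpp's quantize tool roughly like this (input and output filenames are placeholders):

```shell
# Quantize the BF16 GGUF down to IQ1_M using the importance matrix
# (filenames are placeholders for the files you downloaded)
./llama-quantize --imatrix 405imatrix.dat \
    Meta-Llama-3.1-405B-Instruct-BF16.gguf \
    Meta-Llama-3.1-405B-Instruct-IQ1_M.gguf \
    IQ1_M
```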

Let me know if you need bigger quants.