etemiz committed
Commit 248c616
1 Parent(s): bcfb212

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -12,8 +12,9 @@ which is converted from Llama 3.1 405B:
 https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct
 
 
-llama.cpp version b3459
+llama.cpp version b3459. There is ongoing work in llama.cpp to support this model. If you use context = 8192, there are some reports that say this model works fine. If not, you can also try changing the Frequency Base as described in: https://www.reddit.com/r/LocalLLaMA/comments/1ectacp/until_the_rope_scaling_is_fixed_in_gguf_for/
 
 imatrix file https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/blob/main/405imatrix.dat
 
-Lmk if you need bigger quants.
+Lmk if you need bigger quants.
+
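The workaround added to the README above can be tried directly on the command line. A minimal sketch, assuming a local llama.cpp build (b3459 or later) and a placeholder quant filename — the model path and the frequency-base value are illustrative, not taken from this repo:

```shell
# Run with an 8192-token context (the configuration some reports say works fine).
# The model path is a placeholder for whichever quant you downloaded.
./llama-cli \
  -m ./Meta-Llama-3.1-405B-Instruct-Q2_K.gguf \
  -c 8192 \
  -p "Hello"

# If longer contexts misbehave before RoPE scaling support lands in llama.cpp,
# override the RoPE frequency base, as the linked Reddit thread describes.
# The value below is only an example to show the flag; tune it yourself.
./llama-cli \
  -m ./Meta-Llama-3.1-405B-Instruct-Q2_K.gguf \
  -c 16384 \
  --rope-freq-base 8000000 \
  -p "Hello"
```

Both `-c` (context size) and `--rope-freq-base` are existing llama.cpp flags; this is command-line configuration rather than runnable code, so treat it as a starting point.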