open_llama_3b_gguf / README.md
license: apache-2.0

GGUF versions of OpenLLaMA 3B

Newer quantizations

llama.cpp now has more quantization types, some of them below 4 bits per weight. These are currently not supported for this model, possibly because some of its weight tensors have dimensions that are not divisible by 256.
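As a rough illustration of the divisibility issue, the sketch below checks tensor dimensions against a block size of 256. The tensor names and shapes are assumptions chosen for illustration (a hidden size of 3200 and a feed-forward size of 8640) and are not taken from this README, so verify them against the actual model files. `offending_dims` and `find_incompatible` are hypothetical helpers, not part of llama.cpp.

```python
# Block size that the newer quantization types are assumed to require.
BLOCK_SIZE = 256

# Illustrative tensor shapes (assumed values, not read from the model files).
EXAMPLE_SHAPES = {
    "token_embd.weight": (32000, 3200),
    "attn_q.weight": (3200, 3200),
    "ffn_up.weight": (8640, 3200),
}


def offending_dims(shape, block=BLOCK_SIZE):
    """Return the dimensions in `shape` that are not a multiple of `block`."""
    return [dim for dim in shape if dim % block != 0]


def find_incompatible(shapes, block=BLOCK_SIZE):
    """Map each tensor name to its offending dimensions, skipping clean tensors."""
    result = {}
    for name, shape in shapes.items():
        bad = offending_dims(shape, block)
        if bad:
            result[name] = bad
    return result


if __name__ == "__main__":
    for name, bad in find_incompatible(EXAMPLE_SHAPES).items():
        print(f"{name}: dimensions {bad} are not divisible by {BLOCK_SIZE}")
```

With these assumed shapes every tensor is flagged, because 3200 is not a multiple of 256.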

Perplexity on wiki.test.406

Coming soon...