SlyEcho/open_llama_3b_gguf

	---
	license: apache-2.0
	---

	# GGUF versions of OpenLLaMA 3B

	- Version: final release, trained on 1T tokens
	- Project: [OpenLLaMA: An Open Reproduction of LLaMA](https://github.com/openlm-research/open_llama)
	- Model: [openlm-research/open_llama_3b](https://huggingface.co/openlm-research/open_llama_3b)
	- [llama.cpp](https://github.com/ggerganov/llama.cpp): build 1012 (6381d4e) or later
	- [GGML version](https://huggingface.co/SlyEcho/open_llama_3b_ggml)

	## Newer quantizations

	llama.cpp now has more quantization types, some below 4 bits per weight.
	These are not currently supported for this model, possibly because some weight tensors have dimensions that are not divisible by 256.
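
A rough illustration of the divisibility issue, not a statement of how llama.cpp actually checks it. The sketch assumes that the 256 refers to the super-block size used by the newer quantization types, and that the model has a hidden size of 3200 and an intermediate size of 8640. Both dimensions are assumptions here and should be checked against the upstream `config.json`. The helper `divisibility_report` is hypothetical and only exists for this example.

```python
# Block size the newer quantization types are assumed to require (assumption).
SUPER_BLOCK = 256

# Assumed dimensions for OpenLLaMA 3B; verify against the upstream config.json.
ASSUMED_DIMS = {
    "hidden_size": 3200,
    "intermediate_size": 8640,
}


def divisibility_report(dims, block=SUPER_BLOCK):
    """Return {name: (size, remainder)} for each dimension modulo `block`."""
    return {name: (size, size % block) for name, size in dims.items()}


if __name__ == "__main__":
    for name, (size, rem) in divisibility_report(ASSUMED_DIMS).items():
        status = "ok" if rem == 0 else f"not divisible (remainder {rem})"
        print(f"{name} = {size}: {status}")
```

Under these assumptions neither dimension is a multiple of 256, which would be consistent with the guess above. A model with a hidden size of 4096, for comparison, divides evenly.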

	## Perplexity on wiki.test.406

	Coming soon...