Update README.md
README.md
@@ -190,6 +190,8 @@ quantized_by: bartowski

## Llamacpp iMatrix Quantizations of Meta-Llama-3-8B-Instruct

+<b>Now that the official release supporting Llama 3 is out [here](https://github.com/ggerganov/llama.cpp/releases/tag/b2710), this will be tagged "-old" and new quants will be made with no changes to configuration.</b>
+
This model has the <|eot_id|> token set to not-special, which seems to work better with current inference engines.
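If you want to verify the flag yourself, here is a minimal sketch using the Hugging Face `transformers` tokenizer. The repo id below is the upstream model and is an assumption for illustration, not a reference to this repo:

```python
from transformers import AutoTokenizer

# Assumed upstream repo id for illustration; point this at whatever
# checkpoint you actually converted.
tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

eot_id = tok.convert_tokens_to_ids("<|eot_id|>")
added = tok.added_tokens_decoder  # maps token id -> AddedToken metadata
print(eot_id, added[eot_id].special)  # prints False when the token is not-special
```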
Quantized using the <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> fork from pcuenca, branch <a href="https://github.com/pcuenca/llama.cpp/tree/llama3-conversion">llama3-conversion</a>.
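For context, here is a minimal sketch of the convert-then-quantize flow this refers to, driven from Python. The checkout location, model directory, output filenames, and the precomputed imatrix file are all assumptions, not paths from this repo:

```python
import subprocess

# Assumed local path to a built checkout of pcuenca/llama.cpp
# (llama3-conversion branch), with the HF model directory alongside it.
LLAMA_CPP = "./llama.cpp"

# 1. Convert the HF checkpoint to an f16 GGUF.
subprocess.run(
    ["python", f"{LLAMA_CPP}/convert.py", "Meta-Llama-3-8B-Instruct",
     "--outtype", "f16", "--outfile", "Meta-Llama-3-8B-Instruct-f16.gguf"],
    check=True,
)

# 2. Quantize, applying the importance matrix (imatrix.dat assumed precomputed).
subprocess.run(
    [f"{LLAMA_CPP}/quantize", "--imatrix", "imatrix.dat",
     "Meta-Llama-3-8B-Instruct-f16.gguf",
     "Meta-Llama-3-8B-Instruct-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```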