TheBloke/llama-2-70b-Guanaco-QLoRA-GPTQ

Text Classification

text-generation

text-generation-inference

4-bit precision

TheBloke committed on Aug 8, 2023

Commit ac92994 · 1 Parent(s): dd33981

Update README.md

Files changed (1)

README.md +2 -1

@@ -38,7 +38,8 @@ Multiple GPTQ parameter permutations are provided; see Provided Files below for
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/llama-2-70b-Guanaco-QLoRA-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/llama-2-70b-Guanaco-QLoRA-GGML)
-* [Original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/Mikael110/llama-2-70b-guanaco-qlora)
+* [Merged fp16 model, for GPU inference and further conversions](https://huggingface.co/TheBloke/llama-2-70b-Guanaco-QLoRA-fp16)
+* [Mikael110's original QLoRA adapter](https://huggingface.co/Mikael110/llama-2-70b-guanaco-qlora)
 ## Prompt template: Guanaco