Update README.md
README.md CHANGED
@@ -7,6 +7,8 @@ inference: false
 
This is a [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) 4bit quantisation of [changsung's alpaca-lora-65B](https://huggingface.co/chansung/alpaca-lora-65b)

+ I also have 4bit and 2bit GGML files for CPU inference available here: [TheBloke/alpaca-lora-65B-GGML](https://huggingface.co/TheBloke/alpaca-lora-65B-GGML).
+
## These files need a lot of VRAM!

I believe they will work on 2 x 24GB cards, and I hope that at least the 1024g file will work on an A100 40GB.
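For readers wanting a concrete starting point, here is a minimal sketch of loading a 4bit GPTQ checkpoint like this across multiple GPUs. It assumes the AutoGPTQ library; the repo id and prompt are illustrative placeholders, and the original files were built for GPTQ-for-LLaMa's own inference scripts rather than this API.

```python
# Minimal sketch, assuming the AutoGPTQ library (pip install auto-gptq).
# The repo id below is a hypothetical placeholder for this model card;
# the files here were originally made for GPTQ-for-LLaMa's own scripts.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo_id = "TheBloke/alpaca-lora-65B-GPTQ"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)

# device_map="auto" lets accelerate shard the 4bit weights across all
# visible GPUs, e.g. the 2 x 24GB setup suggested above.
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device_map="auto",
    use_safetensors=True,
)

prompt = "### Instruction:\nName three planets.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```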