Update README.md
README.md CHANGED
@@ -7,6 +7,8 @@ inference: false
 
This is a [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) 4bit quantisation of [changsung's alpaca-lora-65B](https://huggingface.co/chansung/alpaca-lora-65b)

+ I also have 4bit and 2bit GGML files for CPU inference available here: [TheBloke/alpaca-lora-65B-GGML](https://huggingface.co/TheBloke/alpaca-lora-65B-GGML).
+
## These files need a lot of VRAM!

I believe they will work on 2 x 24GB cards, and I hope that at least the 1024g file will work on an A100 40GB.
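For readers wanting a concrete starting point, here is a minimal sketch of loading a 4bit GPTQ checkpoint like this across multiple GPUs. It assumes the AutoGPTQ library; the repo id and prompt are illustrative placeholders, and the original files were built for GPTQ-for-LLaMa's own inference scripts rather than this API.

```python
# Minimal sketch, assuming the AutoGPTQ library (pip install auto-gptq).
# The repo id below is a hypothetical placeholder for this model card;
# the files here were originally made for GPTQ-for-LLaMa's own scripts.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo_id = "TheBloke/alpaca-lora-65B-GPTQ"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)

# device_map="auto" lets accelerate shard the 4bit weights across all
# visible GPUs, e.g. the 2 x 24GB setup suggested above.
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device_map="auto",
    use_safetensors=True,
)

prompt = "### Instruction:\nName three planets.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```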