Update README.md
README.md CHANGED
```diff
@@ -20,10 +20,13 @@ Made by: [huihui-ai](https://huggingface.co/huihui-ai)
 [4.5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/4.5bpw-h6)
 [5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/5bpw-h6)
 [6bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/6bpw-h6)
+Didn't make 8bpw.
 
 ## Quantization notes
-
-
+I made these quants by accident and stopped before finishing 8bpw after noticing the [v2 version](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2), which is why the 8bpw quant is missing.
+
+Made with Exllamav2 0.2.3 using the default calibration dataset. These quants require a modern RTX card on Windows/Linux or an AMD card on Linux.
+The model has to fit entirely in VRAM to work properly. For example, an RTX 3060 12GB should be able to load the 4.5-5bpw quants with Q6 cache and 16k context.
 It requires an app with Exllamav2 loader, such as Text-Generation-WebUI, TabbyAPI and some others.
 
 # Original model card
```
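For reference, exl2 quants like these are produced with the convert.py script from the exllamav2 repo. The notes only say "Exllamav2 0.2.3 with the default dataset", so the exact command below is a sketch under that assumption, with placeholder paths; omitting the `-c` calibration option is what selects the default dataset, and `-b`/`-hb` correspond to the bpw and h6 numbers in the branch names:

```sh
# Sketch of quantizing with exllamav2's convert.py (run from the exllamav2 repo).
# All paths are placeholders; -b is bits per weight, -hb is head bits (the "h6").
# Omitting -c makes convert.py use its built-in default calibration dataset.
python convert.py -i /models/Qwen2.5-14B-Instruct-abliterated \
    -o /tmp/exl2-work \
    -cf /models/Qwen2.5-14B-Instruct-abliterated-exl2-4.5bpw-h6 \
    -b 4.5 -hb 6
```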
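Each quant lives on its own branch of the repo (the `tree/4.5bpw-h6` part of the links above), so a single variant can be fetched by passing that branch as the revision; a minimal example with a placeholder destination directory:

```sh
# Download only the 4.5bpw-h6 branch of the quant repo.
huggingface-cli download cgus/Qwen2.5-14B-Instruct-abliterated-exl2 \
    --revision 4.5bpw-h6 \
    --local-dir ./Qwen2.5-14B-Instruct-abliterated-exl2-4.5bpw-h6
```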
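Text-Generation-WebUI and TabbyAPI wrap the same Exllamav2 loader, so for completeness here is a minimal sketch of loading the quant directly with the exllamav2 Python API, modeled on the library's own inference example; the model path is a placeholder, and the Q6 cache and 16k context mirror the RTX 3060 example from the notes:

```python
# Minimal sketch: load an exl2 quant with the exllamav2 Python API.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Cache_Q6,
    ExLlamaV2Config,
    ExLlamaV2Tokenizer,
)
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Placeholder path to a downloaded quant branch.
config = ExLlamaV2Config("./Qwen2.5-14B-Instruct-abliterated-exl2-4.5bpw-h6")
config.max_seq_len = 16384  # the 16k context from the notes

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q6(model, lazy=True)  # Q6 KV cache, as in the notes
model.load_autosplit(cache)  # load weights, filling available VRAM

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Hello, how are you?", max_new_tokens=64))
```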