cgus committed (verified)
Commit 0889bf6 · 1 parent: 740e43c

Update README.md

Files changed (1): README.md (+5 -2)
README.md CHANGED
@@ -20,10 +20,13 @@ Made by: [huihui-ai](https://huggingface.co/huihui-ai)
[4.5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/4.5bpw-h6)
[5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/6bpw-h6)
+ Didn't make 8bpw.

## Quantization notes
- Made with Exllamav2 0.2.3 with the default dataset. These quants are meant for modern RTX cards on Windows/Linux or AMD on Linux.
- The model have to fit the GPU to work properly. For example RTX3060/12GB should be able to load 4bpw with 16k context.
+ I made these quants by accident and didn't finish the 8bpw one after noticing the [v2 version](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2), which is why the 8bpw quant is missing.
+
+ Made with Exllamav2 0.2.3 and its default calibration dataset. These quants require a modern RTX card on Windows/Linux or an AMD card on Linux.
+ The model has to fit entirely in GPU VRAM to work properly. For example, an RTX 3060 12GB should be able to load the 4.5-5bpw quants with Q6 cache and 16k context.
It requires an app with an Exllamav2 loader, such as Text-Generation-WebUI, TabbyAPI and others.

  # Original model card
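
Each bpw variant listed above lives on its own branch of the repo, so a single quant can be fetched without pulling every branch. A minimal sketch using huggingface_hub's `snapshot_download`; the choice of the 5bpw-h6 branch and the local directory name are examples, not part of the commit:

```python
# Fetch one quant branch instead of the whole repo.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="cgus/Qwen2.5-14B-Instruct-abliterated-exl2",
    revision="5bpw-h6",                    # branch name corresponds to the quant level
    local_dir="Qwen2.5-14B-exl2-5bpw-h6",  # example destination (assumption)
)
```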
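And a rough sketch of what the VRAM notes translate to when loading the quant directly with the exllamav2 Python package (0.2.x) rather than through Text-Generation-WebUI or TabbyAPI; the model path reuses the example download above, and the prompt is only an illustration:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q6, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

config = ExLlamaV2Config("Qwen2.5-14B-exl2-5bpw-h6")  # path from the download sketch above
config.max_seq_len = 16384                            # 16k context, as in the notes

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q6(model, lazy=True)  # quantized Q6 KV cache to save VRAM
model.load_autosplit(cache)                  # fills available GPU(s); errors out if the model doesn't fit

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)

print(generator.generate(prompt="Hello!", max_new_tokens=64))
```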