cgus/Qwen2.5-14B-Instruct-abliterated-exl2


cgus committed on Nov 9, 2024

Commit 0ecfc49 · verified · 1 parent: eed6429

Update README.md

Files changed (1)

README.md +18 -2


@@ -5,13 +5,29 @@ license_link: https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated/
 language:
 - en
 pipeline_tag: text-generation
-base_model: Qwen/Qwen2.5-14B-Instruct
+base_model: huihui-ai/Qwen2.5-14B-Instruct-abliterated
 tags:
 - chat
 - abliterated
 - uncensored
 ---
+# Qwen2.5-14B-Instruct-abliterated-exl2
+Model: [Qwen2.5-14B-Instruct-abliterated](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated)
+Made by: [huihui-ai](https://huggingface.co/huihui-ai)
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/6bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/Qwen2.5-14B-Instruct-abliterated-exl2/tree/8bpw-h8)
+## Quantization notes
+Made with Exllamav2 0.2.3 with the default dataset. These quants are meant for modern RTX cards on Windows/Linux or AMD on Linux.
+The model has to fit in GPU memory to work properly. For example, an RTX 3060 12GB should be able to load the 4bpw quant with 16k context.
+It requires an app with an Exllamav2 loader, such as Text-Generation-WebUI, TabbyAPI, or others.
+# Original model card
 # huihui-ai/Qwen2.5-14B-Instruct-abliterated