Update README.md

README.md CHANGED

@@ -49,9 +49,11 @@ Transformers version 4.33.0 is required.
 
 Due to the huge size of the model, the GPTQ has been sharded. This will break compatibility with AutoGPTQ, and therefore any clients/libraries that use AutoGPTQ directly.
 
-But they work great
+But they work great loaded directly through Transformers - and can be served using Text Generation Inference!
 
-
+## Compatibility
+
+Currently these GPTQs are known to work with:
 - Transformers 4.33.0
 - [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) version 1.0.4
 - Docker container: `ghcr.io/huggingface/text-generation-inference:latest`
@@ -72,7 +74,6 @@ Currently these GPTQs are tested to work with:
 ```
 User: {prompt}
 Assistant:
-
 ```
 
 <!-- prompt-template end -->
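The updated README states that the sharded GPTQ loads directly through Transformers (4.33.0) and uses a `User: {prompt}` / `Assistant:` template. A minimal sketch of that workflow under those assumptions — the model ID is a placeholder, not taken from this repository:

```python
# Prompt template as given in the README's prompt-template section.
PROMPT_TEMPLATE = "User: {prompt}\nAssistant:"


def build_prompt(user_message: str) -> str:
    """Fill the User/Assistant prompt template from the README."""
    return PROMPT_TEMPLATE.format(prompt=user_message)


def load_sharded_gptq(model_id: str):
    """Load a sharded GPTQ checkpoint directly with Transformers.

    Per the compatibility list, Transformers >= 4.33.0 is required;
    device_map="auto" spreads the shards across available devices.
    """
    # Deferred import: transformers is a heavy optional dependency here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model


if __name__ == "__main__":
    # "your-org/your-gptq-model" below is a hypothetical model ID.
    print(build_prompt("Summarize the compatibility notes."))
    # tokenizer, model = load_sharded_gptq("your-org/your-gptq-model")
```

For serving instead of local loading, the same checkpoint can be pointed at the `ghcr.io/huggingface/text-generation-inference:latest` container listed in the Compatibility section.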