Update README.md

README.md CHANGED

@@ -49,9 +49,11 @@ Transformers version 4.33.0 is required.
 
 Due to the huge size of the model, the GPTQ has been sharded. This will break compatibility with AutoGPTQ, and therefore any clients/libraries that use AutoGPTQ directly.
 
-But they work great
+But they work great loaded directly through Transformers - and can be served using Text Generation Inference!
 
-
+## Compatibility
+
+Currently these GPTQs are known to work with:
 - Transformers 4.33.0
 - [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) version 1.0.4
 - Docker container: `ghcr.io/huggingface/text-generation-inference:latest`
@@ -72,7 +74,6 @@ Currently these GPTQs are tested to work with:
 ```
 User: {prompt}
 Assistant:
-
 ```
 
 <!-- prompt-template end -->
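The updated README states that the sharded GPTQ loads directly through Transformers (4.33.0) and uses a `User: {prompt}` / `Assistant:` template. A minimal sketch of that workflow under those assumptions — the model ID is a placeholder, not taken from this repository:

```python
# Prompt template as given in the README's prompt-template section.
PROMPT_TEMPLATE = "User: {prompt}\nAssistant:"


def build_prompt(user_message: str) -> str:
    """Fill the User/Assistant prompt template from the README."""
    return PROMPT_TEMPLATE.format(prompt=user_message)


def load_sharded_gptq(model_id: str):
    """Load a sharded GPTQ checkpoint directly with Transformers.

    Per the compatibility list, Transformers >= 4.33.0 is required;
    device_map="auto" spreads the shards across available devices.
    """
    # Deferred import: transformers is a heavy optional dependency here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model


if __name__ == "__main__":
    # "your-org/your-gptq-model" below is a hypothetical model ID.
    print(build_prompt("Summarize the compatibility notes."))
    # tokenizer, model = load_sharded_gptq("your-org/your-gptq-model")
```

For serving instead of local loading, the same checkpoint can be pointed at the `ghcr.io/huggingface/text-generation-inference:latest` container listed in the Compatibility section.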