elysiantech/gemma-2b-gptq-4bit


elysiantech committed on Jul 6

Commit 3459fe9 • 1 Parent(s): f2b33fb

Update README.md

Files changed (1)

README.md +1 -1


@@ -26,5 +26,5 @@ gemma-2b-gptq-4bit is a version of the [2B base model](https://huggingface.co/go
 Please refer to the [Original Gemma Model Card](https://ai.google.dev/gemma/docs) for details about the model preparation and training processes.
 ## Dependencies
-- [`auto-gptq'](https://pypi.org/project/auto-gptq/0.7.1/) – [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ.git) was used to quantize the phi-3 model.
+- [`auto-gptq`](https://pypi.org/project/auto-gptq/0.7.1/) – [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ.git) was used to quantize the phi-3 model.
 - [`vllm==0.4.2`](https://pypi.org/project/vllm/0.4.2/) – [vLLM](https://github.com/vllm-project/vllm) was used to host models for benchmarking.