elysiantech committed 064a1a7 (parent: eea595e): Update README.md
---
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
tags:
- text-generation-inference
- gemma
- gptq
- google
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
  agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging
  Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
---

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kFznlPlWYOrcgd7Q1NI2tYMLH_vTRuys?usp=sharing)

# elysiantech/gemma-2b-gptq-4bit

gemma-2b-gptq-4bit is a 4-bit quantized version of the [Gemma 2B base model](https://huggingface.co/google/gemma-2b), produced with the GPTQ method developed by [Lin et al. (2023)](https://arxiv.org/abs/2308.07662).
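GPTQ-style 4-bit quantization stores each weight as a 4-bit integer plus a per-group floating-point scale. The sketch below illustrates that storage format with plain round-to-nearest quantization; it is a simplification (real GPTQ additionally corrects rounding error using second-order information), and the function names and group size are illustrative, not taken from this repository.

```python
import numpy as np

def quantize_4bit(w, group_size=32):
    """Round-to-nearest 4-bit quantization with per-group scales.

    Schematic of the storage format GPTQ-style quantizers produce;
    GPTQ itself also compensates rounding error, which this sketch omits.
    """
    w = w.reshape(-1, group_size)
    # Symmetric int4 range is -8..7; scale maps the largest magnitude to 7.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from int4 values and scales.
    return (q * scale).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 64)).astype(np.float32)
q, scale = quantize_4bit(w.reshape(-1))
w_hat = dequantize(q, scale).reshape(w.shape)
err = np.abs(w - w_hat).max()
print(f"max abs reconstruction error: {err:.4f}")
```

Each group of 32 weights shares one scale, so the stored size is roughly 4 bits per weight plus a small per-group overhead, which is where the memory savings of a 4-bit checkpoint come from.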
Please refer to the [original Gemma model card](https://ai.google.dev/gemma/docs) for details about the model preparation and training processes.

## Dependencies

- [`auto-gptq==0.7.1`](https://pypi.org/project/auto-gptq/0.7.1/) – [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ.git) was used to quantize the Gemma 2B model.
- [`vllm==0.4.2`](https://pypi.org/project/vllm/0.4.2/) – [vLLM](https://github.com/vllm-project/vllm) was used to host models for benchmarking.
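Since vLLM is listed above for hosting, a launch command for serving this checkpoint with vLLM's OpenAI-compatible server might look like the following. The flags are standard vLLM options, but this exact invocation is an assumption, not taken from the repository.

```shell
# Hypothetical example: serve the GPTQ checkpoint with vLLM's
# OpenAI-compatible API server (assumes vllm==0.4.2 is installed
# and you have accepted the Gemma license on Hugging Face).
python -m vllm.entrypoints.openai.api_server \
    --model elysiantech/gemma-2b-gptq-4bit \
    --quantization gptq \
    --dtype float16 \
    --port 8000
```

Once running, the server accepts OpenAI-style `/v1/completions` requests against the model name given in `--model`.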