---
license: openrail
model_creator: axiong
model_name: PMC_LLaMA_13B
---

# PMC-LLaMA-13B - AWQ
- Model creator: [axiong](https://huggingface.co/axiong)
- Original model: [PMC_LLaMA_13B](https://huggingface.co/axiong/PMC_LLaMA_13B)

## Description

This repo contains AWQ model files for [PMC_LLaMA_13B](https://huggingface.co/axiong/PMC_LLaMA_13B).

### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with equivalent or better quality than the most commonly used GPTQ settings.
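The core arithmetic of 4-bit weight quantization can be sketched with a toy group-wise quantizer. This is a simplified illustration only, not the AWQ algorithm itself: real AWQ additionally chooses scales from activation statistics and packs weights for custom kernels, and the function names below are hypothetical.

```python
def quantize_4bit(weights, group_size=4):
    """Symmetric 4-bit quantization: integers in [-8, 7], one float scale per group."""
    q, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Scale so the largest-magnitude weight in the group maps to +/-7.
        scale = max(abs(w) for w in group) / 7 or 1.0  # fall back if group is all zeros
        scales.append(scale)
        q.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q, scales

def dequantize_4bit(q, scales, group_size=4):
    """Recover approximate float weights: value = integer * its group's scale."""
    return [q[i] * scales[i // group_size] for i in range(len(q))]

weights = [0.1, -0.5, 0.25, 0.9]
q, scales = quantize_4bit(weights)
approx = dequantize_4bit(q, scales)
```

Storing 4-bit integers plus a few scales is what shrinks a 13B model's weights to roughly a quarter of their fp16 size, at the cost of the small rounding error visible in `approx`.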