phi-3-mini-4k-instruct-gptq-4bit
phi-3-mini-4k-instruct-gptq-4bit is a version of the Microsoft Phi 3 mini 4k Instruct model that was quantized using the GPTQ method developed by Frantar et al. (2022).
Please refer to the original Phi 3 mini model card for details about the model's preparation and training.
Dependencies
- `auto-gptq` – AutoGPTQ was used to quantize the Phi 3 model.
- `vllm==0.4.2` – vLLM was used to host models for benchmarking.
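As a minimal serving sketch, the quantized model can be hosted with vLLM's OpenAI-compatible server (assumes `vllm==0.4.2` installed and a CUDA-capable GPU; `<path-or-repo-of-quantized-model>` is a placeholder for wherever the GPTQ weights live):

```shell
# Sketch: serve the GPTQ-quantized model via vLLM's OpenAI-compatible API server.
# The model path below is a placeholder, not a confirmed repository name.
python -m vllm.entrypoints.openai.api_server \
    --model <path-or-repo-of-quantized-model> \
    --quantization gptq \
    --dtype float16
```

Once running, the server accepts standard OpenAI-style completion requests on its local port.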