Llama-2-13b-chat-hf - bnb 4bit

Description

This model is a 4-bit quantized version of Llama-2-13b-chat-hf, produced with bitsandbytes. It is intended for fine-tuning. The PAD token is set to UNK.
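
Because PAD and UNK share a token, padding positions can only be told apart from real tokens through the attention mask. The sketch below illustrates this with a small pure-Python padding helper, plus a loader for the checkpoint. It is not the author's own code: the repository id is the one shown on the model page, `load_model_and_tokenizer` and `pad_batch` are hypothetical helper names, and the UNK id of 0 is an assumption based on the standard Llama 2 tokenizer.

```python
from typing import Dict, List

# Repository id as shown on the model page; swap in a local path if needed.
REPO_ID = "itsanurag/Llama-2-13b-Chat-4BitQuantized"


def load_model_and_tokenizer(repo_id: str = REPO_ID):
    """Load the pre-quantized checkpoint and its tokenizer (hypothetical helper).

    Assumes transformers, accelerate and bitsandbytes are installed and a
    CUDA GPU is available. Imports are lazy so pad_batch works without them.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # The card states PAD is set to UNK; set it explicitly as a fallback
    # in case the tokenizer files do not carry that setting.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.unk_token

    # The quantization settings are expected to be read from the checkpoint's
    # own config, so no extra quantization arguments are passed here.
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return model, tokenizer


def pad_batch(sequences: List[List[int]], pad_id: int) -> Dict[str, List[List[int]]]:
    """Right-pad token id sequences with pad_id and build the attention mask."""
    max_len = max((len(seq) for seq in sequences), default=0)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding that should be ignored.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}
```

Calling `pad_batch([[1, 5, 6], [1, 7]], pad_id=0)` pads the second sequence with one UNK/PAD id and marks that position with 0 in the mask. For fine-tuning, a 4-bit checkpoint like this is typically trained through adapters (for example LoRA via the peft library) rather than by updating the quantized weights directly.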

Safetensors
Model size: 6.87B params
Tensor types: F32, FP16, U8

Model tree for itsanurag/Llama-2-13b-Chat-4BitQuantized

Quantized (20): this model