The IQ2_BN and IQ2_BN_R4 versions of microsoft/bitnet-b1.58-2B-4T-gguf, for use with ik_llama.cpp.

I recommend the IQ2_BN_R4 version, but you can also use -rtr with the IQ2_BN version to convert it at run time.

The chat template embedded in the model is incorrect. I did not change it; it comes from the original Microsoft GGUF. I will upload a fixed version later.

An example of correct usage:

<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant:
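Until the fixed template is uploaded, the prompt can be assembled by hand. Below is a minimal Python sketch that reproduces the single-turn example above. The helper name `build_prompt` is hypothetical. Multi-turn handling (ending every earlier turn with `<|eot_id|>`) is my assumption, extrapolated from the single-turn format, and has not been checked against the model.

```python
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def build_prompt(messages):
    """Format (role, text) pairs into a raw prompt string.

    `messages` is a list of tuples such as ("User", "Hello").
    Each turn is written as "Role: text<|eot_id|>", and the prompt
    ends with "Assistant:" so the model generates the next reply.
    Multi-turn formatting is an assumption, not a documented format.
    """
    parts = [BOS]
    for role, text in messages:
        parts.append(f"{role}: {text}{EOT}")
    parts.append("Assistant:")
    return "".join(parts)


if __name__ == "__main__":
    # Rebuilds the single-turn example shown above.
    print(build_prompt([("User", "Hey, are you conscious? Can you talk to me?")]))
```

The resulting string is meant to be passed as a raw prompt, bypassing the template stored in the GGUF.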
