The IQ2_BN and IQ2_BN_R4 versions of microsoft/bitnet-b1.58-2B-4T-gguf for use with ik_llama.cpp.
I recommend the IQ2_BN_R4 version, but you can use -rtr with IQ2_BN to repack it at runtime.
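For illustration, a possible launch command with run-time repacking (a sketch only: the llama-cli binary from ik_llama.cpp, the local file name, and the context/thread settings are assumptions to adapt to your setup):

```bash
# Load the IQ2_BN quant and repack it at load time with -rtr,
# which in practice gives you the same layout as the IQ2_BN_R4 file.
./llama-cli -m ./bitnet-b1.58-2B-4T-IQ2_BN.gguf -rtr -c 4096 -t 8 -p "Hello"
```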
The chat template embedded in the model is incorrect (I did not change it; it comes from the original Microsoft GGUF). I will upload a fixed version later.
An example of correct usage:
<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant:
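Until the fixed template is uploaded, one workaround is to bypass the embedded chat template and pass a prompt in the format above directly (a sketch; binary name and model path are assumptions):

```bash
# Supply the prompt in the correct format by hand instead of relying on the
# GGUF's (incorrect) built-in chat template.
./llama-cli -m ./bitnet-b1.58-2B-4T-IQ2_BN_R4.gguf \
  -p "<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant:"
```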