Bielik-11B-v2.2-Instruct-Quanto-8bit

This model was converted to Quanto format from SpeakLeash's Bielik-11B-v.2.2-Instruct.

DISCLAIMER: Be aware that quantised models show reduced response quality and possible hallucinations!

About Quanto

Optimum Quanto is a pytorch quantization backend for optimum. Model can be loaded using:

from optimum.quanto import QuantizedModelForCausalLM

qmodel = QuantizedModelForCausalLM.from_pretrained('speakleash/Bielik-11B-v2.2-Instruct-Quanto-8bit')

Model description:

Responsible for model quantization

  • Remigiusz KinasSpeakLeash - team leadership, conceptualizing, calibration data preparation, process creation and quantized model delivery.

Contact Us

If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our Discord SpeakLeash.

Downloads last month
27
Safetensors
Model size
11.2B params
Tensor type
F32
·
I8
·
Inference Examples
Inference API (serverless) has been turned off for this model.

Model tree for speakleash/Bielik-11B-v2.2-Instruct-Quanto-8bit

Finetuned
(9)
this model

Collection including speakleash/Bielik-11B-v2.2-Instruct-Quanto-8bit