Official AQLM quantization of google/gemma-2b.

For this quantization, we used 2 codebooks of 8 bits each.
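
As a rough illustration of what "2 codebooks of 8 bits" means for storage, the sketch below computes the number of entries in an 8-bit codebook and the index bits spent per weight. The function names are hypothetical, and the group size of 8 weights per code is an assumption that is not stated in this card; the figure also ignores the storage taken by the codebooks themselves, scales, and any unquantized layers.

```python
def codebook_entries(bits_per_codebook: int) -> int:
    """Number of vectors addressable by an index of the given width."""
    return 2 ** bits_per_codebook


def index_bits_per_weight(num_codebooks: int, bits_per_codebook: int, group_size: int) -> float:
    """Bits of codebook indices stored per weight.

    Each group of `group_size` weights is represented by one index per
    codebook, so the index cost is spread across the whole group.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return num_codebooks * bits_per_codebook / group_size


# Scheme from this card: 2 codebooks, 8 bits each.
# ASSUMPTION: groups of 8 weights per code (not stated in this card).
print(codebook_entries(8))                          # 256
print(index_bits_per_weight(2, 8, group_size=8))    # 2.0
```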

Results:

| Model | AQLM scheme | WinoGrande | PiQA | HellaSwag | ArcE | ArcC | Model size, GB |
|---|---|---|---|---|---|---|---|
| gemma-2b | 2x8 | 0.5801 | 0.6828 | 0.3891 | 0.5791 | 0.2534 | 1.6 |

To learn more about inference, and for instructions on how to quantize models yourself, please refer to the official GitHub repo.