- Quantization of gemma-2-9b-it-abliterated for edge devices (~4.9 GB footprint)
- Original model: https://huggingface.co/IlyaGusev/gemma-2-9b-it-abliterated
- All quants were made using the imatrix option, with the calibration dataset from here
- llama.cpp compiled with CUDA support was used for quantization and inference (example invocations are sketched after the build log below):
  ggml_cuda_init: found 2 CUDA devices:
    Device 0: NVIDIA GeForce RTX 4060 Ti, compute capability 8.9, VMM: yes
    Device 1: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
  version: 3982 (cc2983d3)
  built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
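
A minimal sketch of how an imatrix quant like this can be produced with llama.cpp's tools. It assumes the base model has already been converted to an F16 GGUF and that the calibration text is saved locally; the file names `gemma-2-9b-it-abliterated-f16.gguf`, `calibration.txt`, and the choice of `Q4_K_M` are placeholders, not the exact settings used for this upload:

```bash
# Sketch only: file names and quant type are placeholders; adjust to your setup.

# 1) Build the importance matrix from the calibration dataset, offloading to GPU.
./llama-imatrix \
    -m gemma-2-9b-it-abliterated-f16.gguf \
    -f calibration.txt \
    -o imatrix.dat \
    -ngl 99

# 2) Quantize the F16 GGUF using the importance matrix (Q4_K_M shown as an example type).
./llama-quantize \
    --imatrix imatrix.dat \
    gemma-2-9b-it-abliterated-f16.gguf \
    gemma-2-9b-it-abliterated-Q4_K_M.gguf \
    Q4_K_M
```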