These files are quantized with larger embedding and output weights than the normal GGUF settings.

Q8_0 embed and output weights: Q6_K_L, Q5_K_L, Q4_K_L
bf16 embed and output weights (inference may be slower): Q8_0_L, Q6_K_XL, Q5_K_XL, Q4_K_XL
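
The two lists above can be sketched as a small lookup table. This is an illustrative sketch, not the script used to produce these files: the helper names are hypothetical, the "base" quantization type is assumed to be the variant name with its `_L` / `_XL` suffix removed, and the command builder assumes llama.cpp's `llama-quantize` tool with its `--token-embedding-type` and `--output-tensor-type` options.

```python
# Variant name -> (assumed base quant type, embed/output weight type).
# The embed/output column comes from the lists above; the base column is
# an assumption (variant name minus the _L / _XL suffix).
VARIANTS = {
    "Q6_K_L": ("Q6_K", "Q8_0"),
    "Q5_K_L": ("Q5_K", "Q8_0"),
    "Q4_K_L": ("Q4_K", "Q8_0"),
    "Q8_0_L": ("Q8_0", "bf16"),
    "Q6_K_XL": ("Q6_K", "bf16"),
    "Q5_K_XL": ("Q5_K", "bf16"),
    "Q4_K_XL": ("Q4_K", "bf16"),
}


def embed_output_type(variant: str) -> str:
    """Return the embed/output weight type listed for a variant."""
    return VARIANTS[variant][1]


def quantize_command(variant: str, src: str, dst: str) -> list[str]:
    """Build a hypothetical llama-quantize invocation for a variant.

    Assumes the tool accepts lowercase type names such as q8_0 and bf16.
    """
    base, eo_type = VARIANTS[variant]
    return [
        "llama-quantize",
        "--token-embedding-type", eo_type.lower(),
        "--output-tensor-type", eo_type.lower(),
        src,
        dst,
        base,
    ]


if __name__ == "__main__":
    # File names here are placeholders, not files from this repository.
    print(" ".join(quantize_command("Q6_K_L", "model-f32.gguf", "model-Q6_K_L.gguf")))
```

The table makes the naming rule explicit: every `_XL` file, plus `Q8_0_L`, carries bf16 embed and output weights, while the remaining `_L` files carry Q8_0.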

GGUF

Model size: 9.24B params
Architecture: gemma2
Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit


Model tree for pipihand01/gemma-2-9b-it-SimPO-GGUF: Base model > Finetuned > Finetuned > Quantized (29), including this model