My GGUF-IQ-Imatrix quants for Nitral-AI/Hathor-L3-8B-v.01.

This model might still be a bit experimental.

"Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance."

Quantization process:
For future reference, these quants were made after the fixes from #6920 were merged.
Imatrix data was generated from the FP16-GGUF, and the conversions were made directly from the BF16-GGUF.
This was a bit more disk and compute intensive but hopefully avoided any losses during conversion.
If you notice any issues, let me know in the discussions.

General usage:
Use the latest version of KoboldCpp.
Remember that you can also use --flashattention on KoboldCpp now even with non-RTX cards for reduced VRAM usage.
For 8GB VRAM GPUs, I recommend the Q4_K_M-imat quant for context sizes up to 12288.
For 12GB VRAM GPUs, the Q5_K_M-imat quant will give you a great size/quality balance.
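
As a rough illustration of the recommendations above, here is a minimal Python sketch that picks a quant for a given amount of VRAM and builds a matching KoboldCpp command line. The model file name is a hypothetical placeholder, and picking the largest listed tier for in-between VRAM sizes is my assumption, not something stated above. The flags `--model`, `--contextsize` and `--flashattention` are KoboldCpp options as far as I know, but check `--help` on your version before relying on them.

```python
import shlex

# Tiers taken from the recommendations above; other VRAM sizes are not covered there.
QUANT_BY_VRAM_GB = {8: "Q4_K_M-imat", 12: "Q5_K_M-imat"}


def recommended_quant(vram_gb: int) -> str:
    """Return the quant for the largest listed tier that fits (an assumption)."""
    tiers = [tier for tier in sorted(QUANT_BY_VRAM_GB) if tier <= vram_gb]
    if not tiers:
        raise ValueError("no recommendation listed for less than 8 GB of VRAM")
    return QUANT_BY_VRAM_GB[tiers[-1]]


def koboldcpp_command(vram_gb: int, context_size: int) -> list[str]:
    """Build a KoboldCpp launch command as an argument list."""
    quant = recommended_quant(vram_gb)
    # Hypothetical file name: use the actual name of the file you downloaded.
    model_file = f"Hathor-L3-8B-v.01-{quant}.gguf"
    return [
        "python", "koboldcpp.py",
        "--model", model_file,
        "--contextsize", str(context_size),
        "--flashattention",
    ]


# Example: an 8GB card at the largest context size recommended above.
print(shlex.join(koboldcpp_command(8, 12288)))
```

The command is built as a list so it can be passed to `subprocess.run` directly without shell quoting issues.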

Resources:
You can find out more about how the quants stack up against each other, and about their types, here and here, respectively.

Presets:
Some compatible SillyTavern presets can be found here (Hathor Presets) or here (Virt's Roleplay Presets).

Personal-support:
I apologize for disrupting your experience.
Currently I'm working on moving to a better internet provider.
If you want and you are able to...
You can spare some change over here (Ko-fi).

Author-support:
You can support the author at their own page.

Original model text information:

"Hathor-v0.1 is a model based on the LLaMA 3 architecture: Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance. Making it an ideal tool for a wide range of applications; such as creative writing, educational support and human/computer interaction."

Recommended ST Presets: Hathor Presets

Notes: Hathor is trained on 3 epochs of private RP data, synthetic Opus instructions, and a mix of light/classical novel data. (Heavily WIP)

If you want to use vision functionality:

You must use the latest version of KoboldCpp.

To use the multimodal (vision) capabilities of this model, you need to load the specified mmproj file, which can be found inside this model repo (Llava MMProj).

You can load the mmproj by using the corresponding section in the interface.
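
If you launch KoboldCpp from the command line instead of the interface, the same thing can be sketched in Python as below. This is only an illustration: both file names are hypothetical placeholders, and `--mmproj` is the KoboldCpp option for this as far as I know, so confirm it with `--help` on your version.

```python
import shlex


def koboldcpp_vision_command(model_file: str, mmproj_file: str) -> list[str]:
    """Build a KoboldCpp launch command that loads a model together with its mmproj file."""
    return [
        "python", "koboldcpp.py",
        "--model", model_file,
        "--mmproj", mmproj_file,
    ]


# Both file names are hypothetical: substitute the files you actually downloaded.
command = koboldcpp_vision_command(
    "Hathor-L3-8B-v.01-Q4_K_M-imat.gguf",
    "llava-mmproj.gguf",
)
print(shlex.join(command))
```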
