My GGUF-IQ-Imatrix quants for Nitral-AI/Hathor-L3-8B-v.01.

This model might still be a bit experimental.

"Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance."

Quantization process:
For future reference, these quants have been done after the fixes from #6920 have been merged.
Imatrix data was generated from the FP16-GGUF and conversions directly from the BF16-GGUF.
This was a bit more disk and compute intensive but hopefully avoided any losses during conversion.
If you noticed any issues let me know in the discussions.

General usage:
Use the latest version of KoboldCpp.
Remember that you can also use --flashattention on KoboldCpp now even with non-RTX cards for reduced VRAM usage.
For 8GB VRAM GPUs, I recommend the Q4_K_M-imat quant for up to 12288 context sizes.
For 12GB VRAM GPUs, the Q5_K_M-imat quant will give you a great size/quality balance.

Resources:
You can find out more about how each quant stacks up against each other and their types here and here, respectively.

Presets:
Some compatible SillyTavern presets can be found here (Hathor Presets) or here (Virt's Roleplay Presets).

Personal-support:
I apologize for disrupting your experience.
Currently I'm working on moving for a better internet provider.
If you want and you are able to...
You can spare some change over here (Ko-fi).

Author-support:
You can support the author at their own page.

image/webp

Original model text information:

"Hathor-v0.1 is a model based on the LLaMA 3 architecture: Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance. Making it an ideal tool for a wide range of applications; such as creative writing, educational support and human/computer interaction."

Recomended ST Presets: Hathor Presets


Notes: Hathor is trained on 3 epochs of private rp data, synthetic opus instructons, a mix of light/classical novel data. (Heavily wip)


  • If you want to use vision functionality:
  • You must use the latest versions of Koboldcpp.
  • To use the multimodal capabilities of this model and use vision you need to load the specified mmproj file, this can be found inside this model repo. Llava MMProj
  • You can load the mmproj by using the corresponding section in the interface:

image/png

Downloads last month
154
GGUF
Model size
8.03B params
Architecture
llama

3-bit

4-bit

5-bit

6-bit

8-bit

Inference API
Unable to determine this model's library. Check the docs .

Collection including Lewdiculous/Hathor-L3-8B-v.01-GGUF-IQ-Imatrix