
Exl2

#1
by SekkSea - opened

Any chance of an EXL2 variant of this, rather than only the GGUF? I have been limited to 13b, and would love to try a 3-bit variant of a 20b model.
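For a rough sense of the sizes involved, here is a back-of-the-envelope sketch. It assumes the limit is memory and counts only the quantized weights (no context cache or loader overhead); the function name `approx_weight_gib` is made up for illustration, and real EXL2 files will differ somewhat.

```python
def approx_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of the quantized weights alone, in GiB.

    Ignores the KV cache, activations, and any per-file overhead,
    so treat the result as a lower bound, not a measurement.
    """
    total_bits = params_billion * 1e9 * bits_per_weight
    total_bytes = total_bits / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    # Compare a 4-bit 13b against a 3-bit 20b (weights only).
    for label, params, bpw in [("13b @ 4 bpw", 13, 4.0), ("20b @ 3 bpw", 20, 3.0)]:
        print(f"{label}: ~{approx_weight_gib(params, bpw):.1f} GiB")
```

By this estimate the two land within roughly a GiB of each other, which is why a 3-bit 20b is a plausible step up from a 4-bit 13b.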


I'm currently trying to make a 4-bit h6 quant of the Inverted one, but it takes a very long time, and after 3 days I still haven't managed to produce an EXL2 lmao.
If I achieve that, I will try your request.

What is the model loader for this?


Llama.cpp for GGUF
ExLlama 2 for EXL2
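As a small illustration of that mapping, here is a sketch that guesses the loader from a model's file or folder name. The helper `guess_loader` and its naming heuristic (a `.gguf` suffix, or "exl2" somewhere in the name) are assumptions for the example, not something any tool is known to do this way.

```python
from pathlib import Path
from typing import Optional


def guess_loader(model_path: str) -> Optional[str]:
    """Guess a loader from the model's name alone (hypothetical heuristic).

    - a file ending in .gguf  -> llama.cpp
    - a name containing exl2  -> ExLlama 2
    - anything else           -> None (unknown)
    """
    path = Path(model_path)
    if path.suffix.lower() == ".gguf":
        return "llama.cpp"
    if "exl2" in path.name.lower():
        return "ExLlama 2"
    return None


if __name__ == "__main__":
    # Example names are invented for the demonstration.
    for name in ["some-model.q4_k_m.gguf", "some-model-3bpw-exl2", "some-model"]:
        print(name, "->", guess_loader(name))
```

It only inspects the name, so a repo that doesn't follow these naming habits would come back as unknown.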
