Can you produce a 2.4bpw quantization of this model?

#1
by xldistance - opened

An RTX 4090 with 24 GB of VRAM can only run this model at around 2.4bpw.

xldistance changed discussion title from Can you produce a 2.34bpw quantization of this model? to Can you produce a 2.4bpw quantization of this model?

I've posted 2.75, 2.5 and 2.25bpw quantizations of Athene. I'm running perplexity scoring now and will update the READMEs with the scores when it's done.

Perplexity scores have now also been added.
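
For anyone checking which of these fits a 24 GB card, here is a rough back-of-the-envelope sketch. It only estimates the size of the quantized weights as parameters × bpw / 8 bytes. The 70B parameter count is an assumption (the thread does not state the model size), and the helper name `weight_gib` is made up for illustration. KV cache, activations and framework overhead come on top, so the real limit is lower than the raw weight size suggests.

```python
def weight_gib(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights alone, in GiB.

    n_params: number of model parameters
    bpw:      average bits per weight of the quantization
    """
    return n_params * bpw / 8 / 2**30


# Assumption: a 70B-parameter model. Adjust to the actual parameter count.
N_PARAMS = 70e9

for bpw in (2.25, 2.4, 2.5, 2.75):
    # Weights only; context cache and runtime overhead are not included.
    print(f"{bpw:.2f} bpw -> {weight_gib(N_PARAMS, bpw):.1f} GiB of weights")
```

Whatever is left of the 24 GB after the weights has to hold the context cache, so a lower bpw leaves room for a longer context.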

Dracones changed discussion status to closed
