Can this be quantized with bitsandbytes?

#6
by Permahuman - opened

Atm it's a huge safetensors model, and GGUF/llama.cpp support likely won't be coming for quite some time. Can we use bitsandbytes to quantize this and run it at 4-bit? And will it still be functional?
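In case anyone wants to reproduce the attempt, here is a minimal sketch of a naive 4-bit load through transformers' `BitsAndBytesConfig`. The model id and the Auto class are placeholder assumptions, since it isn't clear which class (if any) supports this architecture:

```python
# A minimal sketch of a naive 4-bit load via bitsandbytes + transformers.
# The model id and the Auto class are assumptions; an unsupported multimodal
# architecture would fail right here, which may be what people are hitting.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4-bit on load
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the usual default
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "org/model-id",                         # hypothetical placeholder id
    quantization_config=bnb_config,
    device_map="auto",
)
```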

Just naively tried and it won't work.

I wonder how in the world to even diagnose how to get this quantized. This model could be a huge game changer if we can get its size down to 4-bit. Qwen is already balling, but this would put a cherry on top of their ice cream heap.

> Just naively tried and it won't work.

Thank you for trying. I am reluctant to download it and try to quantize it myself, as it will take over 12 hours on my slow internet connection to acquire the model. Cheers.

Is bnb even that good? I tried load_in_4bit with Llama 3 and the quality is clearly worse than GPTQ/AWQ/GGUF...

> Is bnb even that good? I tried load_in_4bit with Llama 3 and the quality is clearly worse than GPTQ/AWQ/GGUF...

I don't know, honestly. I just thought it would be a much quicker route to quantization than waiting for llama.cpp to implement GGUF support. Maybe AWQ or GPTQ would be faster than going through the llama.cpp repo?
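If someone wants to test the AWQ route, a minimal sketch with the AutoAWQ library follows. The model path is a placeholder, and AutoAWQ is built around text-only causal LMs, so there's no guarantee it handles a multimodal checkpoint like this one:

```python
# A minimal sketch of 4-bit AWQ quantization with the AutoAWQ library.
# The model path is a hypothetical placeholder; AutoAWQ targets text-only
# causal LMs, so a multimodal checkpoint may not load at all.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "org/model-id"   # hypothetical placeholder
quant_path = "model-awq-4bit"

# Standard 4-bit AWQ settings
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Calibrate and quantize (AutoAWQ supplies a default calibration dataset)
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```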

> Is bnb even that good? I tried load_in_4bit with Llama 3 and the quality is clearly worse than GPTQ/AWQ/GGUF...

> I don't know, honestly. I just thought it would be a much quicker route to quantization than waiting for llama.cpp to implement GGUF support. Maybe AWQ or GPTQ would be faster than going through the llama.cpp repo?

lol !

> Just naively tried and it won't work.

It would need to use the llama.cpp surgery approach: each component needs to be extracted first, same as LLaVA. The image processor and then the audio processor would be split out, either as bin files and then converted to GGUF, so in the end there would be three GGUFs: one for the image encoder, one for the audio encoder, and one for the LLM, like the LLaVA models! A sketch of that flow is below.
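For reference, a hedged sketch of the existing LLaVA flow, driven from Python. The script names come from llama.cpp's examples/llava directory and may differ between versions, all paths are placeholders, and no audio-encoder equivalent of these scripts exists yet, which is the missing piece for this model:

```python
# A sketch of the LLaVA-style surgery flow in llama.cpp, driven from Python.
# Script names are from llama.cpp's examples/llava directory and may differ
# between versions; all paths are hypothetical placeholders. There is no
# audio-encoder equivalent of these scripts yet.
import subprocess

MODEL_DIR = "/path/to/local/checkpoint"     # hypothetical
VISION_ENCODER = "/path/to/vision/encoder"  # hypothetical

def run(args):
    """Run one conversion step, failing loudly if it errors."""
    subprocess.run(args, check=True)

# 1. Split the multimodal projector out of the combined checkpoint.
run(["python", "examples/llava/llava_surgery.py", "-m", MODEL_DIR])

# 2. Convert the vision encoder plus projector to its own GGUF.
run([
    "python", "examples/llava/convert_image_encoder_to_gguf.py",
    "-m", VISION_ENCODER,
    "--llava-projector", f"{MODEL_DIR}/llava.projector",
    "--output-dir", MODEL_DIR,
])

# 3. Convert the remaining language model to GGUF as usual.
run(["python", "convert_hf_to_gguf.py", MODEL_DIR])
```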
