Converted GGUF-format model is very slow on inference (is that expected?)
#1 by bangbang - opened
I converted KoLLaVA-Synatra-7b to GGUF format, but the GGUF model is so slow that I thought I couldn't use it. (It is unusably slow.)
Can you tell me whether it is expected to be this slow?
How did you quantize it? e.g. Q8_0, Q4_K_M?
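For reference, a typical llama.cpp convert-and-quantize flow looks roughly like this. This is a sketch, not the exact steps the poster used: the script and binary names have changed across llama.cpp versions, and the model directory and output filenames below are placeholders.

```shell
# Convert the Hugging Face checkpoint to a 16-bit GGUF file.
# (Older llama.cpp releases shipped this as convert.py.)
python convert_hf_to_gguf.py ./KoLLaVA-Synatra-7b --outfile kollava-synatra-7b-f16.gguf

# Quantize the f16 file down to Q4_K_M, a common size/speed trade-off.
# (The binary was named ./quantize in older builds, ./llama-quantize in newer ones.)
./llama-quantize kollava-synatra-7b-f16.gguf kollava-synatra-7b-Q4_K_M.gguf Q4_K_M

# Sanity-check inference speed on a short prompt. If the model file is
# larger than available RAM/VRAM, the OS will swap and inference becomes
# unusably slow — a common cause of "so slow I can't use it".
./llama-cli -m kollava-synatra-7b-Q4_K_M.gguf -p "Hello" -n 32
```

If you skipped the quantize step and ran the f16 GGUF directly, or if you are running a 7B model on CPU with limited RAM, severe slowness is expected; quantizing to Q4_K_M or Q8_0 usually helps a lot.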