Open the model settings and change the prompt template to ChatML.
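For reference, a correctly applied ChatML template should wrap each turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of what the rendered prompt looks like (the function name here is just for illustration):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Render a single-turn conversation in ChatML format."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model continues from here
    )

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```

If the output you see from the model contains stray `<|im_end|>` tokens or ignores the system prompt, the template is likely not being applied.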
I also found that partial GPU offload (splitting layers between CPU and GPU) produces gibberish, so try setting GPU offload to 100%.