LM Studio vs llama.cpp different results?
I tested QwQ-32B-Q3_K_M.gguf in the two clients from the title. LM Studio just got updated, so I just ran the model without making any fixes (temp 0.6 of course, repeat penalty 1). And it seems that the model performs better in LM Studio?
I asked the same question SEVERAL times in LM Studio and llama.cpp (llama-cli on the command line). In llama.cpp the model thinks too much and "goes in the wrong direction", so to speak, so the final answer is not the best, while in LM Studio it does everything right.
Maybe someone can provide correct settings for llama.cpp? Mine are:
llama-cli -m QwQ-32B-Q3_K_M.gguf -c 7000 -ngl 65 --multiline-input --conversation --color --temp 0.6 --no-mmap --mlock
So basically default. What is missing?
Could you try using our suggested settings in llama.cpp:
./llama.cpp/llama-cli \
--model unsloth-QwQ-32B-GGUF/QwQ-32B-Q4_K_M.gguf \
--threads 32 \
--ctx-size 16384 \
--n-gpu-layers 99 \
--seed 3407 \
--prio 2 \
--temp 0.6 \
--repeat-penalty 1.1 \
--dry-multiplier 0.5 \
--min-p 0.01 \
--top-k 40 \
--top-p 0.95 \
-no-cnv \
--samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc" \
--prompt "<|im_start|>user\nCreate a Flappy Bird game in Python."
See our guide here for more info btw: https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively
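For comparison, here is a rough sketch of the command from the original question with the sampler flags from the suggested settings added on top. It is only an illustration: the `MODEL` and `DRY_RUN` variables are made up for this sketch, `llama-cli` is assumed to be on the PATH, and the context size and GPU layer count are kept from the original post rather than tuned.

```shell
#!/bin/sh
# Sketch only: original llama-cli invocation plus the suggested sampler flags.
# MODEL and DRY_RUN are hypothetical helper variables, not llama.cpp options.
MODEL="${MODEL:-QwQ-32B-Q3_K_M.gguf}"

# Collect the arguments as positional parameters so the quoting of the
# --samplers value (which contains semicolons) survives intact.
set -- \
  -m "$MODEL" \
  -c 7000 \
  -ngl 65 \
  --multiline-input --conversation --color \
  --no-mmap --mlock \
  --temp 0.6 \
  --repeat-penalty 1.1 \
  --dry-multiplier 0.5 \
  --min-p 0.01 \
  --top-k 40 \
  --top-p 0.95 \
  --samplers "top_k;top_p;min_p;temperature;dry;typ_p;xtc"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: print the command, one argument per line, for inspection.
  printf '%s\n' "llama-cli" "$@"
else
  # DRY_RUN=0 actually launches llama-cli with the arguments above.
  exec llama-cli "$@"
fi
```

Running it as-is only prints the arguments; `DRY_RUN=0 sh script.sh` would start the model. To mirror the LM Studio run described in the question instead, `--repeat-penalty` could be set to 1.0.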
I know, of course... And I tried those settings too. But for now, I feel like leaving everything at default in LM Studio (only setting temp 0.6 and repeat penalty 1) works perfectly...
Can you believe it, I'm using your QwQ-32B-Q3_K_M.gguf quant, and it seems to work as if it were Q5_K_M or something...
I gave it a super hard prompt and it worked! >>>
"Create Mandelbrot fractal simulator with Earth like colors and rendered on a 3D sphere. Use html, css, javascript, single file. Allow movements using mouse dragging and infinite zoom using mouse wheel."
With my super slow hardware it thought for almost an HOUR (10000 tokens), but gave me working code.
And this is not a single example; I tried different hard prompts, and it seems super intelligent, at ONLY Q3_K_M!