recommended generation parameters

#5
by erichartford - opened

https://huggingface.co/Qwen/QwQ-32B/blob/main/generation_config.json

Is this your recommendation?
"repetition_penalty": 1.0,
"temperature": 0.6,
"top_k": 40,
"top_p": 0.95,

Nah, the model thinks for way longer on neutral settings (everything set to one or zero). Trying these resulted in messy output and stray tokens.
Edit: yeah, that was a one-off; I drew the wrong conclusion.

Yeah, I'm noticing far too much thinking and second-guessing.
I am asking here for the recommended generation parameters.

But that does seem to be their recommendation, as it is also stated in the README:

Sampling Parameters:
Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid endless repetitions.
Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output.
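
For anyone wondering what these three knobs actually do to the next-token distribution, here is a minimal pure-Python sketch. It is a simplified illustration, not the actual implementation in any inference library: the function names `filter_distribution` and `sample` are made up for this example, and the order of operations (temperature, then top-k, then top-p over the renormalized survivors) is an assumption that may differ between backends.

```python
import math
import random


def filter_distribution(logits, temperature=0.6, top_k=40, top_p=0.95):
    """Return {token_id: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration only; defaults mirror the values
    quoted from the README above.
    """
    # Temperature below 1.0 sharpens the distribution before softmax.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely tokens, then renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample(logits, rng=random, **params):
    """Draw one token id from the filtered distribution."""
    dist = filter_distribution(logits, **params)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    toy_logits = [5.0, 4.0, 1.0, 0.0, -3.0]  # made-up values for a 5-token vocabulary
    print(filter_distribution(toy_logits))
    print(sample(toy_logits, rng=random.Random(0)))
```

With the toy logits above, only the two most likely tokens survive the top-p cut, which is the "filter out rare token occurrences" effect the README describes. Greedy decoding corresponds to always taking the single top token (`top_k=1` here).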

erichartford changed discussion status to closed

You are welcome @erichartford

I wrote up more details at https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively, including other settings I found to be effective. It also stops infinite generations!
