recommended generation parameters

by erichartford - opened 4 days ago

Discussion

erichartford

4 days ago

https://huggingface.co/Qwen/QwQ-32B/blob/main/generation_config.json

Is this your recommendation?
"repetition_penalty": 1.0,
"temperature": 0.6,
"top_k": 40,
"top_p": 0.95,

Olafangensan

4 days ago • edited 3 days ago

Nah, the model thinks for way longer on neutral settings (everything set to one or zero). Trying the values above gave me messy output and stray tokens.
Edit: yeah, that was a one-off; bad conclusion on my part.

erichartford

4 days ago

Yeah, I notice the model thinking for much too long and second-guessing itself.
I am asking here for the recommended generation parameters.

owao

4 days ago

But that actually seems to be their recommendation, as it is also stated in the README:

Sampling Parameters:
Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid endless repetitions.
Use TopK between 20 and 40 to filter out rare token occurrences while maintaining the diversity of the generated output.
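
To make the three knobs concrete, here is a minimal pure-Python sketch of how temperature, top-k, and top-p are commonly combined when picking the next token. It is an illustration, not the actual QwQ or transformers implementation: the function names `filtered_distribution` and `sample` are hypothetical, and the filter order (temperature, then top-k, then top-p) is an assumption based on common practice.

```python
import math
import random

def filtered_distribution(logits, temperature=0.6, top_k=40, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Softmax, subtracting the max for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = sorted(
        ((i, e / total) for i, e in enumerate(exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Top-k: keep only the k most likely tokens, then renormalize.
    probs = probs[:top_k]
    mass = sum(p for _, p in probs)
    probs = [(i, p / mass) for i, p in probs]
    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for index, p in probs:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize the survivors so they sum to 1.
    mass = sum(p for _, p in kept)
    return {index: p / mass for index, p in kept}

def sample(logits, rng, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filtered_distribution(logits, **kwargs)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Toy example with made-up logits for a four-token vocabulary.
toy_logits = [2.0, 1.0, 0.0, -1.0]
next_token = sample(toy_logits, random.Random(0), temperature=0.6, top_k=40, top_p=0.95)
```

Greedy decoding would always return the single highest-logit token; sampling from the filtered distribution keeps some randomness while top-k and top-p cut off the unlikely tail.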

erichartford changed discussion status to closed 4 days ago

owao

4 days ago

You are welcome @erichartford

danielhanchen

2 days ago

I wrote up more details at https://docs.unsloth.ai/basics/tutorial-how-to-run-qwq-32b-effectively covering other settings I found to be effective. They also stop the infinite generations!
