Question: did you use beta=0.1?

#1
by eengad - opened

(default in alignment handbook).

BTW I ran MT-bench and got:
gemma-2b-zephyr-dpo 4.347826
gemma-2b-zephyr-sft 4.215625

Weights and Biases org
edited Mar 12

Here is the run: https://wandb.ai/llm_surgery/gemma-zephyr/runs/lbqi9kvq
Nope, beta=0.01. I think the default is 0.05 in the new recipe.
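
For context on what beta controls, here is a minimal sketch of the standard sigmoid DPO loss for a single preference pair. It is not taken from the run above or from the alignment handbook code; the function name `dpo_loss` and the log-probability values are hypothetical and chosen only for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta):
    """Sigmoid DPO loss for one (chosen, rejected) pair.

    Each argument except beta is the summed log-probability of a full
    response under the policy or the frozen reference model.
    """
    # How much more strongly the policy prefers the chosen response
    # than the reference model does.
    margin = ((policy_chosen_logp - policy_rejected_logp)
              - (ref_chosen_logp - ref_rejected_logp))
    # -log(sigmoid(beta * margin)) written as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))


# Hypothetical log-probabilities (assumed values, not from any real run).
for beta in (0.01, 0.05, 0.1):
    loss = dpo_loss(-120.0, -135.0, -125.0, -130.0, beta)
    print(f"beta={beta}: loss={loss:.4f}")
```

Beta scales the margin before the sigmoid, so for the same log-probabilities a smaller value such as 0.01 yields a flatter loss and a weaker pull toward the reference model than 0.1.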

tcapelle changed discussion status to closed
tcapelle changed discussion status to open
