Commit 254116c · Parent: f45584c

Update README.md

README.md CHANGED
@@ -46,6 +46,11 @@ For details, please refer to the version without DPO training: [CausalLM/14B](ht
 | **CausalLM/14B-DPO-α** | **7.618868** |
 | **CausalLM/7B-DPO-α** | **7.038125** |
 
+Dec 2, 2023
+Rank **#2** non-base model of its size on the 🤗 Open LLM Leaderboard; outperforms all ~13B chat models, including microsoft/Orca-2-13b.
+
+[image]
+
 It should be noted that this is not a version that continues training on CausalLM/14B & 7B, but rather an optimized version that has undergone DPO training concurrently on a previous training branch, and some detailed parameters may have changed. You will still need to download the full model.
 
 The beta branch will soon be released, employing some aggressive approaches that might be detrimental in certain tasks, in order to achieve better alignment with human preferences, aiming to meet or exceed the GPT-3.5 benchmarks. Stay tuned.
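Since the README notes that the full model must be downloaded (it is not a delta on CausalLM/14B & 7B), a minimal loading sketch with the standard `transformers` API might look like the following. The repo id `CausalLM/14B-DPO-alpha` and the `device_map`/`torch_dtype` settings are assumptions; check the model card for the exact identifier and recommended settings.

```python
# Minimal sketch: download and load the full DPO-trained checkpoint.
# Assumption: the Hugging Face repo id below matches the model card.
MODEL_ID = "CausalLM/14B-DPO-alpha"


def load_model(model_id: str = MODEL_ID):
    """Fetch the full checkpoint (no delta weights) and return (model, tokenizer)."""
    # Imports kept local so the sketch can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",    # requires `accelerate`; spreads layers across devices
        torch_dtype="auto",   # use the checkpoint's native precision
    )
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_model()
```

Downloading a ~14B-parameter checkpoint requires tens of gigabytes of disk and enough GPU/CPU memory to hold the weights; the 7B variant (`CausalLM/7B-DPO-alpha`, also an assumed id) is loaded the same way.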