Update README.md
README.md
@@ -62,10 +62,11 @@ print(tokenizer.decode(tokens[0], skip_special_tokens=True))
 The dataset is comprised of a mixture of open datasets large-scale datasets available on the [HuggingFace Hub](https://huggingface.co/datasets):
 - HuggingFaceH4/ultrachat_200k
 - HuggingFaceH4/ultrafeedback_binarized
+- Intel/orca_dpo_pairs
 - meta-math/MetaMathQA
-- Capybara
 - Instruct Code Dataset (Internal)
 - Wizard Dataset
+- Open-Orca/SlimOrca
 
 ### Training Procedure
 
@@ -77,7 +78,7 @@ The dataset is comprised of a mixture of open datasets large-scale datasets avai
 
 | Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
 |-------------|-----|----|---------------|--------------|
-| **Stable Zephyr 3B** 🪁 | 3B | DPO | 6.
+| **Stable Zephyr 3B** 🪁 | 3B | DPO | 6.64 | 76.00 |
 | Stable Zephyr (SFT only) | 3B | SFT | 7.12 | 71.15 |
 | MPT-Chat | 7B |dSFT |5.42| -|
 | Xwin-LMv0.1 | 7B| dPPO| 6.19| 87.83|