Update README.md
README.md CHANGED
@@ -21,7 +21,7 @@ This repo contains the model and tokenizer checkpoints for:
 - model family [<b>mistralai/Mistral-7B-Instruct-v0.2</b>](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
 - optimized with the loss [<b>KTO</b>](https://twitter.com/winniethexu/status/1732839295365554643)
 - aligned using the [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
-- via 3 iterations of KTO on one epoch of each training partition, each previous iteration's model serving as the reference for the
+- via 3 iterations of KTO on one epoch of each training partition, each previous iteration's model serving as the reference for the subsequent.

 **[03/06/2024]**: We are #2 on the (verified) [Alpaca Eval 2.0 Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) scoring **33.23**!
@@ -38,9 +38,8 @@ What kind of cake?
 Chocolate cake.
 <|assistant|>
 ```
-Note that a beginning-of-sequence (BOS) token automatically added at tokenization and does not have to be added by you. No end-of-sequence (EOS) token is added to the prompt.
-You may also use our tokenizer
-
+Note that a beginning-of-sequence (BOS) token is automatically added at tokenization time and does not have to be added by you. No end-of-sequence (EOS) token is added to the prompt.
+You may also use our tokenizer's `apply_chat_template` if doing inference with `chatml` set or evaluating generations through non-local clients.

 Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/) for more details on the methodology.
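The updated bullet above describes an iterative schedule: three rounds of KTO, where the policy produced by round *t* is frozen and used as the reference model for round *t+1*. A minimal sketch of that loop follows; `train_kto` is a hypothetical stand-in for one epoch of KTO training (the actual trainer lives in the HALOs repository), and the partition names are illustrative, not the dataset's real split names.

```python
# Sketch of the iterative KTO schedule described in the bullet above.
# `train_kto` is a HYPOTHETICAL placeholder for one epoch of KTO training
# (the real trainer lives in https://github.com/ContextualAI/HALOs);
# here it just returns an identifier for the checkpoint it "produced".

def train_kto(init_ckpt: str, ref_ckpt: str, partition: str) -> str:
    """Run one epoch of KTO on `partition`, starting from `init_ckpt`,
    with `ref_ckpt` held frozen as the reference model (placeholder)."""
    return f"{init_ckpt}->kto[{partition}]"

BASE = "mistralai/Mistral-7B-Instruct-v0.2"
partitions = ["partition_1", "partition_2", "partition_3"]  # illustrative names

policy, reference = BASE, BASE
for part in partitions:
    policy = train_kto(policy, reference, part)
    reference = policy  # previous iteration's model becomes the next reference

print(policy)
```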
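To make the prompt-format notes concrete, here is a minimal sketch using the standard Hugging Face `transformers` tokenizer API. The repo id is a placeholder for this model's repository, and the opening user turn is invented for illustration; `apply_chat_template` and the automatic BOS handling are ordinary `transformers` behavior, not something specific to this card.

```python
# Minimal sketch of building the prompt shown above with `transformers`.
# "YOUR/REPO_ID" is a placeholder for this model's repository id.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("YOUR/REPO_ID")

# The first user turn is illustrative; the last two turns match the example.
messages = [
    {"role": "user", "content": "Can you suggest a cake recipe?"},
    {"role": "assistant", "content": "What kind of cake?"},
    {"role": "user", "content": "Chocolate cake."},
]

# Let the chat template produce the prompt string, ending with the
# <|assistant|> tag so the model generates the next assistant turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Tokenizing the prompt prepends the BOS token automatically and appends
# no EOS token, as the note above says.
inputs = tokenizer(prompt, return_tensors="pt")
```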