xwinxu committed on
Commit
8d0fec9
·
verified ·
1 Parent(s): 8b7e5cc

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -19,8 +19,8 @@ metrics:
 
 
  This repo contains the model and tokenizer checkpoints for:
- - model family <b>mistralai/Mistral-7B-Instruct-v0.2</b>
- - optimized with the loss <b>KTO</b>
+ - model family [<b>mistralai/Mistral-7B-Instruct-v0.2</b>](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+ - optimized with the loss [<b>KTO</b>](https://twitter.com/winniethexu/status/1732839295365554643)
  - aligned using the [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
  - via 3 iterations of KTO on one epoch of each training partition.
 
@@ -42,7 +42,7 @@ You may also use our tokenizer to `apply_chat_template` if doing inference with
 
 
 
- Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/) for more information on the methodology.
+ Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/) for more details on the methodology.
 
  If you found this work useful, feel free to cite [our work](https://arxiv.org/abs/2402.01306):
  ```