Update README.md
README.md (changed)
@@ -19,8 +19,8 @@ metrics:
 
 
 This repo contains the model and tokenizer checkpoints for:
-- model family <b>mistralai/Mistral-7B-Instruct-v0.2</b>
-- optimized with the loss <b>KTO</b>
+- model family [<b>mistralai/Mistral-7B-Instruct-v0.2</b>](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+- optimized with the loss [<b>KTO</b>](https://twitter.com/winniethexu/status/1732839295365554643)
 - aligned using the [snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset](https://huggingface.co/datasets/snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset)
 - via 3 iterations of KTO on one epoch of each training partition.
@@ -42,7 +42,7 @@ You may also use our tokenizer to `apply_chat_template` if doing inference with
 
 
 
-Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/) for more
+Please refer to our [code repository](https://github.com/ContextualAI/HALOs) or [blog](https://contextual.ai/better-cheaper-faster-llm-alignment-with-kto/) for more details on the methodology.
 
 If you found this work useful, feel free to cite [our work](https://arxiv.org/abs/2402.01306):
 ```
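The second hunk's context line mentions using the repo's tokenizer with `apply_chat_template` for inference. A minimal sketch of what that might look like, assuming the Hugging Face `transformers` library; the repo id below is taken from the base-model line of the diff (the diff itself does not name the aligned checkpoint, so substitute the real one):

```python
def build_prompt(tokenizer, user_message: str) -> str:
    """Render a single-turn chat into the model's expected prompt string.

    Works with any tokenizer exposing `apply_chat_template`
    (e.g. a transformers `PreTrainedTokenizer`).
    """
    messages = [{"role": "user", "content": user_message}]
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,  # append the assistant-turn prefix
    )


if __name__ == "__main__":
    # Requires `pip install transformers` and network access to the Hub.
    from transformers import AutoTokenizer

    # Assumption: base-model id from the diff, not the aligned checkpoint.
    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
    print(build_prompt(tok, "What is KTO?"))
```

The import is kept inside the `__main__` guard so the helper can be reused (or tested with a stub tokenizer) without pulling in `transformers`.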