---
language:
- en
license: apache-2.0
tags:
- human feedback
- rlhf
- preferences
- alignment
- HALO
- halos
- dpo
- rl
datasets:
- snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
metrics:
- accuracy
---

This repo contains the model and tokenizer checkpoints for:
- model family mistralai/Mistral-7B-Instruct-v0.2
- optimized with the KTO loss
- aligned using the snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
- via 3 iterations of KTO, each on one epoch of a training partition, with each iteration's model serving as the reference for the next (see the loss sketch below)
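
For intuition, here is a minimal sketch of a KTO-style loss on per-example sequence log-probabilities. It paraphrases the loss described in the HALOs report and is not the authors' reference implementation; in particular, the batch-mean estimate of the reference point `z0` (the report uses a KL estimate over mismatched pairs) and the default hyperparameters are assumptions.

```python
import torch

def kto_loss(policy_logps, ref_logps, is_desirable,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Sketch of a KTO-style loss; inputs are per-example sequence log-probs.

    is_desirable is a bool tensor marking outputs labeled desirable.
    """
    # Implied reward: log-ratio of policy to reference likelihoods.
    rewards = policy_logps - ref_logps
    # Reference point z0: crude batch-mean stand-in for the KL term
    # (assumption), clamped at zero and detached so no gradient flows
    # through it.
    z0 = rewards.mean().clamp(min=0).detach()
    # Desirable outputs are pushed above z0, undesirable ones below it.
    losses = torch.where(
        is_desirable,
        lambda_d * (1 - torch.sigmoid(beta * (rewards - z0))),
        lambda_u * (1 - torch.sigmoid(beta * (z0 - rewards))),
    )
    return losses.mean()
```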
[03/06/2024]: We are #2 on the (verified) AlpacaEval 2.0 leaderboard, scoring 33.23!
To prompt this model, ensure that the format is consistent with that of TuluV2. For example, a prompt should be formatted as follows, where `<|user|>` corresponds to the human's role and `<|assistant|>` corresponds to the LLM's role. The human should speak first:
```
<|user|>
Hi! I'm looking for a cake recipe.
<|assistant|>
What kind of cake?
<|user|>
Chocolate cake.
<|assistant|>
```
Note that a beginning-of-sequence (BOS) token is automatically added at tokenization time and does not have to be added by you. No end-of-sequence (EOS) token is added to the prompt.
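
As a concrete example, the following sketch builds this prompt format by hand and generates with `transformers`; the model id string is a placeholder for this repo's id, not a confirmed identifier:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder for this repo's id (assumption) -- substitute the actual id.
model_id = "your-org/this-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# TuluV2-style prompt; the tokenizer adds the BOS token itself, and no EOS
# token is appended to the prompt.
prompt = "<|user|>\nHi! I'm looking for a cake recipe.\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Strip the prompt tokens before decoding the completion.
completion = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))
```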
You may also use our tokenizer's `apply_chat_template` if doing inference with `chatml` set or evaluating generations through non-local clients.
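
For instance, this sketch lets `apply_chat_template` produce the prompt instead of hand-formatting it; the message contents are illustrative, and it reuses the `tokenizer` and `model` from the previous sketch:

```python
messages = [
    {"role": "user", "content": "Hi! I'm looking for a cake recipe."},
    {"role": "assistant", "content": "What kind of cake?"},
    {"role": "user", "content": "Chocolate cake."},
]

# add_generation_prompt=True appends the assistant turn marker so the
# model responds next.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```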
Please refer to our code repository or blog for more details on the methodology.
If you find this work useful, feel free to cite it:
```bibtex
@techreport{ethayarajh2023halos,
  author = {Ethayarajh, Kawin and Xu, Winnie and Jurafsky, Dan and Kiela, Douwe},
  title = {Human-Centered Loss Functions (HALOs)},
  institution = {Contextual AI},
  note = {https://github.com/ContextualAI/HALOs/blob/main/assets/report.pdf},
  year = {2023}
}
```