---
language:
- en
license: apache-2.0
tags:
- human feedback
- rlhf
- preferences
- alignment
- HALO
- halos
- dpo
- rl
datasets:
- snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
metrics:
- accuracy
---

This repo contains the model and tokenizer checkpoints for:
- model family mistralai/Mistral-7B-Instruct-v0.2
- optimized with the KTO loss
- aligned using the snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
- via 3 iterations of KTO, each on one epoch of a training partition, with each iteration's model serving as the reference for the next (see the loss sketch below)
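
For intuition, here is a minimal sketch of a KTO-style loss on per-example sequence log-probabilities. It paraphrases the loss described in the HALOs report and is not the authors' reference implementation; in particular, the batch-mean estimate of the reference point `z0` (the report uses a KL estimate over mismatched pairs) and the default hyperparameters are assumptions.

```python
import torch

def kto_loss(policy_logps, ref_logps, is_desirable,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Sketch of a KTO-style loss; inputs are per-example sequence log-probs.

    is_desirable is a bool tensor marking outputs labeled desirable.
    """
    # Implied reward: log-ratio of policy to reference likelihoods.
    rewards = policy_logps - ref_logps
    # Reference point z0: crude batch-mean stand-in for the KL term
    # (assumption), clamped at zero and detached so no gradient flows
    # through it.
    z0 = rewards.mean().clamp(min=0).detach()
    # Desirable outputs are pushed above z0, undesirable ones below it.
    losses = torch.where(
        is_desirable,
        lambda_d * (1 - torch.sigmoid(beta * (rewards - z0))),
        lambda_u * (1 - torch.sigmoid(beta * (z0 - rewards))),
    )
    return losses.mean()
```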
[03/06/2024]: We are #2 on the (verified) AlpacaEval 2.0 leaderboard, scoring 33.23!
To prompt this model, ensure that the format is consistent with that of TuluV2. For example, a prompt should be formatted as follows, where `<|user|>` corresponds to the human's role and `<|assistant|>` corresponds to the LLM's role. The human should speak first:
```
<|user|>
Hi! I'm looking for a cake recipe.
<|assistant|>
What kind of cake?
<|user|>
Chocolate cake.
<|assistant|>
```
Note that a beginning-of-sequence (BOS) token is automatically added at tokenization time and does not have to be added by you. No end-of-sequence (EOS) token is added to the prompt.
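
As a concrete example, the following sketch builds this prompt format by hand and generates with `transformers`; the model id string is a placeholder for this repo's id, not a confirmed identifier:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder for this repo's id (assumption) -- substitute the actual id.
model_id = "your-org/this-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# TuluV2-style prompt; the tokenizer adds the BOS token itself, and no EOS
# token is appended to the prompt.
prompt = "<|user|>\nHi! I'm looking for a cake recipe.\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Strip the prompt tokens before decoding the completion.
completion = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))
```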
You may also use our tokenizer's `apply_chat_template` if doing inference with `chatml` set or evaluating generations through non-local clients.
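
For instance, this sketch lets `apply_chat_template` produce the prompt instead of hand-formatting it; the message contents are illustrative, and it reuses the `tokenizer` and `model` from the previous sketch:

```python
messages = [
    {"role": "user", "content": "Hi! I'm looking for a cake recipe."},
    {"role": "assistant", "content": "What kind of cake?"},
    {"role": "user", "content": "Chocolate cake."},
]

# add_generation_prompt=True appends the assistant turn marker so the
# model responds next.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```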
Please refer to our code repository or blog for more details on the methodology.
If you find this work useful, feel free to cite it:
```bibtex
@techreport{ethayarajh2023halos,
  author = {Ethayarajh, Kawin and Xu, Winnie and Jurafsky, Dan and Kiela, Douwe},
  title = {Human-Centered Loss Functions (HALOs)},
  institution = {Contextual AI},
  note = {https://github.com/ContextualAI/HALOs/blob/main/assets/report.pdf},
  year = {2023}
}
```