Llama2-7B-CPO / README.md
metadata
license: mit
datasets:
  - tatsu-lab/alpaca
base_model:
  - meta-llama/Llama-2-7b

This model is LLaMA 2 7B aligned on the AlpacaFarm dataset with the Contrastive Preference Optimization (CPO) loss. Training started from the supervised fine-tuned (SFT) version of LLaMA 2 7B and ran for a single epoch. For more information on the dataset, see the AlpacaFarm documentation (https://github.com/tatsu-lab/alpaca_farm).
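
For readers unfamiliar with CPO, the following is a minimal sketch of the general form of the objective as commonly described: a sigmoid preference term on the log-probability margin between the preferred and rejected responses, plus a negative log-likelihood term on the preferred response. It is not the training code used for this model. The function names, `beta = 0.1`, and `nll_weight = 1.0` are illustrative assumptions, since this card does not state the hyperparameters.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cpo_loss(logp_chosen, logp_rejected, beta=0.1, nll_weight=1.0):
    """Sketch of a CPO-style loss over a batch of preference pairs.

    logp_chosen / logp_rejected: sequence log-probabilities of the
    preferred and rejected responses under the policy being trained.
    beta and nll_weight are assumed values, not taken from this model card.
    """
    n = len(logp_chosen)
    # Preference term: push the chosen response above the rejected one.
    prefer = sum(
        -log_sigmoid(beta * (c - r))
        for c, r in zip(logp_chosen, logp_rejected)
    ) / n
    # NLL term: keep the policy likely to produce the chosen response.
    nll = sum(-c for c in logp_chosen) / n
    return prefer + nll_weight * nll


# Toy usage with made-up log-probabilities for two preference pairs.
loss = cpo_loss([-12.0, -9.5], [-15.0, -9.0])
print(f"CPO-style loss: {loss:.4f}")
```

Unlike DPO, this formulation needs no frozen reference model: the preference term uses only the policy's own log-probabilities.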