sabersaleh/Llama2-7B-CPO


sabersaleh committed on Nov 30, 2024

Commit cfc39fd • 1 Parent(s): 8f1d9af

Create README.md

Files changed (1)

README.md +9 -0

README.md ADDED

	@@ -0,0 +1,9 @@

+---
+license: mit
+datasets:
+- tatsu-lab/alpaca
+base_model:
+- meta-llama/Llama-2-7b
+---
+This model was aligned on the AlpacaFarm dataset by fine-tuning with the Contrastive Preference Optimization (CPO) loss. Alignment started from the Supervised Fine-Tuned (SFT) version of LLaMA 2 7B and ran for a single epoch. For more information on the dataset, see the AlpacaFarm documentation (https://github.com/tatsu-lab/alpaca_farm).
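
For readers unfamiliar with the CPO loss named in the README, the following is a minimal sketch of the per-pair objective as it is commonly written: a preference term, `-log sigmoid(beta * (logp_chosen - logp_rejected))`, plus a negative log-likelihood term on the preferred response. It is not the author's training code. The function name `cpo_loss`, the `beta` default of 0.1, and the `nll_weight` parameter are illustrative assumptions; the README does not state which hyperparameters were used.

```python
import math


def cpo_loss(logp_chosen, logp_rejected, beta=0.1, nll_weight=1.0):
    """Sketch of the CPO loss for a single preference pair.

    logp_chosen / logp_rejected are the policy's summed log-probabilities
    of the preferred and dispreferred responses. beta and nll_weight are
    hypothetical defaults, not values taken from this model card.
    """
    margin = beta * (logp_chosen - logp_rejected)
    # -log(sigmoid(margin)) == softplus(-margin), computed in a stable form
    prefer = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    # Behaviour-cloning regulariser on the preferred response
    nll = -logp_chosen
    return prefer + nll_weight * nll


if __name__ == "__main__":
    # The loss shrinks as the rejected response becomes less likely
    print(cpo_loss(-1.0, -1.0))
    print(cpo_loss(-1.0, -5.0))
```

Unlike DPO, this formulation needs no frozen reference model, since only the policy's own log-probabilities appear in the loss.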