chujiezheng committed
Commit d611e86
Parent: 1c66808

Update README.md

Files changed (1): README.md (+2 −0)
@@ -10,6 +10,8 @@ The extrapolated (ExPO) model based on [`princeton-nlp/Llama-3-Instruct-8B-SimPO
 Specifically, we obtain this model by extrapolating **(alpha = 0.3)** from the weights of the SFT and DPO/RLHF checkpoints, achieving superior alignment with human preference.
 
+This model achieves a **40.6%** win rate and a **45.8%** LC win rate on **AlpacaEval 2.0**.
+
 ## Evaluation Results
 
 Evaluation results on the **AlpacaEval 2.0** benchmark (you can find the evaluation outputs on the [official GitHub repo](https://github.com/chujiezheng/LLM-Extrapolation/tree/main/results_alpaca)):
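The weight extrapolation described above can be sketched as follows. This is a minimal illustration, assuming the commonly stated ExPO rule of moving each parameter beyond the aligned (DPO/RLHF) checkpoint along the direction from the SFT weights (`theta_expo = theta_rlhf + alpha * (theta_rlhf - theta_sft)`); the function name and the use of plain floats in place of model tensors are illustrative, not from the repo.

```python
def extrapolate_weights(sft, rlhf, alpha=0.3):
    """Extrapolate parameters beyond the aligned checkpoint.

    sft, rlhf: dicts mapping parameter names to values
    (stand-ins for model state dicts). Assumed ExPO rule:
    theta_expo = theta_rlhf + alpha * (theta_rlhf - theta_sft).
    """
    return {name: rlhf[name] + alpha * (rlhf[name] - sft[name])
            for name in rlhf}

# Toy example with scalar "weights":
sft_weights = {"w": 1.0}
rlhf_weights = {"w": 2.0}
expo_weights = extrapolate_weights(sft_weights, rlhf_weights, alpha=0.3)
# w moves from 2.0 to 2.0 + 0.3 * (2.0 - 1.0) = 2.3
```

With real checkpoints, the same per-parameter arithmetic would be applied to each tensor in the two state dicts.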