wzhouad committed · Commit 5d97ceb · verified · 1 Parent(s): 2c16574

Update README.md

Files changed (1):
  1. README.md +3 -0
README.md CHANGED
@@ -3,6 +3,7 @@ base_model: google/gemma-2-9b-it
 library_name: transformers
 datasets:
 - openbmb/UltraFeedback
+- wzhouad/gemma-2-ultrafeedback-hybrid
 tags:
 - alignment-handbook
 - gemma
@@ -18,6 +19,8 @@ gemma-2-9b-it finetuned by hybrid WPO, utilizing two types of data:
 
 In comparison to the preference data construction method in our paper, we switch to RLHFlow/ArmoRM-Llama3-8B-v0.1 to score the outputs, and choose the outputs with maximum/minimum scores to form a preference pair.
 
+We provide our training data at [wzhouad/gemma-2-ultrafeedback-hybrid](https://huggingface.co/datasets/wzhouad/gemma-2-ultrafeedback-hybrid)
+
 ### [AlpacaEval Eval Results](https://tatsu-lab.github.io/alpaca_eval/)
 | Model | LC | WR | Avg. Length |
 |-------------------------------------------|:------------:|:--------:|:-----------:|
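The README text in this diff describes scoring candidate completions with RLHFlow/ArmoRM-Llama3-8B-v0.1 and keeping the highest- and lowest-scoring outputs as a preference pair. A minimal sketch of that selection step is below; the `score(prompt, response)` helper that wraps the reward model, and all names in the snippet, are assumptions for illustration, not the authors' actual pipeline code.

```python
# Sketch: build one (chosen, rejected) preference pair per prompt by scoring
# every candidate completion with a reward model and taking the max/min score.
# Assumption: `score(prompt, response) -> float` wraps a reward model such as
# RLHFlow/ArmoRM-Llama3-8B-v0.1; its loading/inference details are not shown.
from typing import Callable, Dict, List


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    score: Callable[[str, str], float],
) -> Dict[str, str]:
    """Score each candidate and keep the best/worst as chosen/rejected."""
    scored = [(score(prompt, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0])  # ascending by reward-model score
    return {
        "prompt": prompt,
        "chosen": scored[-1][1],   # highest-scoring completion
        "rejected": scored[0][1],  # lowest-scoring completion
    }
```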