Commit 6c2b778 by chuyi777 · verified · 1 Parent(s): f86b3f2

Create README.md

Files changed (1)
  1. README.md +9 -0
README.md ADDED
@@ -0,0 +1,9 @@
+ This Llama3-8b-based reward model was trained with OpenRLHF on a combination of datasets, available at https://huggingface.co/datasets/OpenLLMAI/preference_700K.
+
+ ```
+ LR Scheduler: Cosine
+ Learning Rate: 9e-6
+ Warmup Ratio: 0.03
+ Batch Size: 256
+ Epochs: 1
+ ```
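
To illustrate how the scheduler settings above interact, here is a minimal sketch of a cosine schedule with warmup. It assumes linear warmup and decay to zero, which the README does not state. The function name `cosine_lr` and the `total_steps` and `min_lr` parameters are hypothetical and are not taken from OpenRLHF.

```python
import math


def cosine_lr(step, total_steps, peak_lr=9e-6, warmup_ratio=0.03, min_lr=0.0):
    """Learning rate at a 0-indexed optimizer step.

    Assumptions (not stated in the README): warmup is linear, and the
    cosine decay ends at `min_lr` after `total_steps` steps.
    """
    # Number of warmup steps implied by the warmup ratio (at least one).
    warmup_steps = max(1, round(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from peak_lr down to min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # total_steps is illustrative only; the real value depends on the
    # dataset size, which the README does not give.
    total = 1000
    for s in (0, 29, 30, 500, 999):
        print(s, cosine_lr(s, total))
```

With a batch size of 256 and a single epoch, `total_steps` would be roughly the number of training pairs divided by 256. The README does not give the exact count, so the value used here is a placeholder.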