This Llama3-8b-based reward model was trained with OpenRLHF on a combination of preference datasets, available at https://huggingface.co/datasets/OpenLLMAI/preference_700K.
```
Cosine Scheduler
Learning Rate: 9e-6
Warmup Ratio: 0.03
Batch Size: 256
Epoch: 1
```
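
To make the hyperparameters above concrete, here is a minimal sketch of how a cosine schedule with warmup could turn them into a per-step learning rate. Several things here are assumptions rather than facts from this README: the dataset size of 700,000 pairs (guessed from the dataset name), linear warmup, decay to a floor of `min_lr=0.0`, and the helper names `total_steps` and `lr_at`. The schedule OpenRLHF actually applies may differ, for example by using a non-zero minimum learning rate.

```python
import math

# Values taken from the hyperparameter list above.
PEAK_LR = 9e-6
WARMUP_RATIO = 0.03
BATCH_SIZE = 256
EPOCHS = 1

# Assumption: roughly 700,000 preference pairs, inferred from the dataset name only.
NUM_EXAMPLES = 700_000


def total_steps(num_examples=NUM_EXAMPLES, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Number of optimizer steps, counting a final partial batch as one step."""
    return math.ceil(num_examples / batch_size) * epochs


def lr_at(step, total, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO, min_lr=0.0):
    """Learning rate at a 0-indexed step: linear warmup, then cosine decay.

    Assumption: decay ends at min_lr (0.0 by default); the real trainer may
    use a different floor.
    """
    warmup = max(1, int(total * warmup_ratio))
    if step < warmup:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    steps = total_steps()
    for s in (0, steps // 2, steps - 1):
        print(f"step {s}: lr = {lr_at(s, steps):.3e}")
```

Under these assumptions, one epoch corresponds to a few thousand optimizer steps, with the first 3% of them spent ramping up to the peak learning rate of 9e-6.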