OpenRLHF/Llama-3-8b-rm-700k


chuyi777 committed on Jul 5, 2024

Commit 6c2b778 · verified · 1 Parent(s): f86b3f2

Create README.md

Files changed (1)

README.md +9 -0

README.md ADDED

	@@ -0,0 +1,9 @@

+This Llama-3-8B-based reward model was trained with OpenRLHF on a combination of preference datasets, available at https://huggingface.co/datasets/OpenLLMAI/preference_700K.
+```
+Cosine Scheduler
+Learning Rate: 9e-6
+Warmup Ratio: 0.03
+Batch Size: 256
+Epoch: 1
+```
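
To make the listed hyperparameters concrete, here is a minimal sketch of how a cosine schedule with linear warmup could turn them into a per-step learning rate. It is not taken from OpenRLHF's code. The function names `total_steps` and `lr_at`, the 700,000-sample count (guessed from the dataset name), reading "Batch Size: 256" as the global batch size, and the decay floor of zero are all assumptions made for illustration.

```python
import math

# Values from the README above.
PEAK_LR = 9e-6
WARMUP_RATIO = 0.03
BATCH_SIZE = 256   # assumed to be the global (effective) batch size
EPOCHS = 1


def total_steps(num_samples, batch_size=BATCH_SIZE, epochs=EPOCHS):
    """Number of optimizer steps, keeping the last partial batch."""
    return math.ceil(num_samples / batch_size) * epochs


def lr_at(step, total, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay toward min_lr.

    min_lr=0.0 is an assumption; the README does not state the floor.
    """
    warmup = max(1, int(total * warmup_ratio))
    if step < warmup:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical sample count, inferred only from the name "preference_700K".
    steps = total_steps(700_000)
    for s in (0, steps // 100, steps // 2, steps):
        print(f"step {s:>5}: lr = {lr_at(s, steps):.3e}")
```

Under these assumptions, one epoch is 2,735 optimizer steps, of which the first 82 are warmup. The actual run may differ if the dataset size, batch accounting, or minimum learning rate were set differently.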