sileod/deberta-v3-large-tasksource-rlhf-reward-model

Text Classification

sileod committed on Mar 28, 2023

Commit d60ef35 · 1 parent: 052ab03

Update README.md

Files changed (1)

README.md +2 -1

@@ -20,6 +20,7 @@ model-index:
             value: 0,7516
             verified: true
 ---
-`deberta-v3-large-tasksource-nli` fine-tuned on Anthropic/hh-rlhf for 1 epoch with 1e-5 learning rate.
+# `deberta-v3-large-tasksource-nli` fine-tuned on Anthropic/hh-rlhf
+For 1 epoch with 1e-5 learning rate.
 Validation accuracy is currently the best publicly available reported: 75.16% (vs 69.25% for `OpenAssistant/reward-model-deberta-v3-large-v2`).
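
The README does not say how the validation accuracy is computed. A common convention for reward models trained on preference pairs such as Anthropic/hh-rlhf is pairwise accuracy: the fraction of pairs in which the preferred ("chosen") response gets a higher score than the "rejected" one. The sketch below assumes that convention. The function name `pairwise_accuracy` and the scores are hypothetical and are not taken from the model or the dataset.

```python
from typing import Sequence


def pairwise_accuracy(
    chosen_scores: Sequence[float], rejected_scores: Sequence[float]
) -> float:
    """Fraction of preference pairs where the chosen response outscores the rejected one.

    Assumption: this mirrors the usual reward-model metric; the README does not
    confirm that its 75.16% figure was computed this way.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("need at least one preference pair")
    # A pair counts as correct only when the chosen score is strictly higher.
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return correct / len(chosen_scores)


# Made-up reward scores for four preference pairs (illustration only).
chosen = [1.2, 0.3, -0.5, 2.0]
rejected = [0.4, 0.9, -1.1, 1.5]
accuracy = pairwise_accuracy(chosen, rejected)
print(f"pairwise accuracy: {accuracy:.2%}")
```

In practice the two score lists would come from running the reward model on each (prompt, chosen) and (prompt, rejected) pair of the validation split; that step is omitted here because the README does not describe the scoring setup.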