Update README.md
README.md
CHANGED
@@ -20,6 +20,7 @@ model-index:
 value: 0,7516
 verified: true
 ---
-`deberta-v3-large-tasksource-nli` fine-tuned on Anthropic/hh-rlhf
+# `deberta-v3-large-tasksource-nli` fine-tuned on Anthropic/hh-rlhf
+For 1 epoch with 1e-5 learning rate.
 
 Validation accuracy is currently the best publicly available reported: 75.16% (vs 69.25% for `OpenAssistant/reward-model-deberta-v3-large-v2`).
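The README does not say how the validation accuracy is computed. For a reward model evaluated on preference pairs such as the `chosen`/`rejected` examples in Anthropic/hh-rlhf, one common convention is pairwise accuracy: the fraction of pairs where the model scores the chosen response above the rejected one. The sketch below assumes that convention; the function name and the scores are hypothetical and are not taken from this model.

```python
from typing import Sequence


def pairwise_accuracy(
    chosen_scores: Sequence[float], rejected_scores: Sequence[float]
) -> float:
    """Fraction of pairs where the chosen response outscores the rejected one.

    Assumption: this is the metric behind the reported accuracy; the README
    does not confirm it.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("chosen and rejected score lists must have equal length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")
    # A pair counts as correct only when the chosen score is strictly higher.
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return correct / len(chosen_scores)


# Hypothetical reward scores for four preference pairs (illustrative only).
chosen = [1.2, 0.3, -0.5, 2.0]
rejected = [0.4, 0.9, -1.1, 1.5]

accuracy = pairwise_accuracy(chosen, rejected)
print(f"{accuracy:.2%}")
```

Ties are counted as errors here. That is a design choice of this sketch, and an evaluation that counts ties differently would report a slightly different number.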