nicholasKluge/RewardModel

Text Classification

preference model

nicholasKluge committed on Jun 7, 2023

Commit 01e9e9f · 1 Parent(s): 0acfb00

Update README.md

Files changed (1)

README.md +1 -1

@@ -15,7 +15,7 @@ tags:
 ---
 # RewardModel (Portuguese-BR)
-The `RewardModel` is a modified BERT model that can be used to score the quality of completion to a given prompt. It is based on the [BERT](https://huggingface.co/bert-base-cased), modified to act as a regression model.
+The `RewardModel` is a modified BERT model that can be used to score the quality of completion to a given prompt. It is based on a [BERT model](https://huggingface.co/bert-base-cased), modified to act as a regression model.
 The `RewardModel` allows the specification of an $\alpha$ parameter, which is a multiplier to the reward score. This multiplier is set to 1 during training (since our reward values are bounded between -1 and 1) but can be changed at inference to allow for rewards with higher bounds.
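
The effect of the $\alpha$ multiplier described in the README text can be illustrated with a minimal sketch. This is not the model's actual implementation: `scale_reward`, `raw_score`, and `alpha` are hypothetical names, and the sketch only assumes what the README states, namely that the raw score lies in [-1, 1] and that $\alpha$ multiplies it.

```python
def scale_reward(raw_score: float, alpha: float = 1.0) -> float:
    """Multiply a bounded reward score by alpha (hypothetical helper).

    Assumes raw_score lies in [-1, 1], as the README states for training,
    so the result lies in [-alpha, alpha] for a non-negative alpha.
    """
    if not -1.0 <= raw_score <= 1.0:
        raise ValueError("raw_score is assumed to lie in [-1, 1]")
    return alpha * raw_score


# With alpha = 1 (the training setting) scores are unchanged;
# a larger alpha at inference widens the reward bounds.
scaled = [scale_reward(s, alpha=10.0) for s in (-1.0, 0.25, 1.0)]
print(scaled)  # [-10.0, 2.5, 10.0]
```

With `alpha=10.0`, for example, the reward range would widen from [-1, 1] to [-10, 10] without changing the ordering of the scores.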