nicholasKluge committed on
Commit 01e9e9f · 1 Parent(s): 0acfb00

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -15,7 +15,7 @@ tags:
 ---
 # RewardModel (Portuguese-BR)
 
-The `RewardModel` is a modified BERT model that can be used to score the quality of completion to a given prompt. It is based on the [BERT](https://huggingface.co/bert-base-cased), modified to act as a regression model.
+The `RewardModel` is a modified BERT model that can be used to score the quality of completion to a given prompt. It is based on a [BERT model](https://huggingface.co/bert-base-cased), modified to act as a regression model.
 
 The `RewardModel` allows the specification of an $\alpha$ parameter, which is a multiplier to the reward score. This multiplier is set to 1 during training (since our reward values are bounded between -1 and 1) but can be changed at inference to allow for rewards with higher bounds.
 
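
Review note: the README context in this hunk describes $\alpha$ as a multiplier on a reward bounded between -1 and 1. The sketch below illustrates only that arithmetic. The helper `scale_reward` is hypothetical and is not part of the model's API. How the model itself applies $\alpha$ internally is not shown in this diff, so treat this as an assumption-laden illustration rather than the actual implementation.

```python
def scale_reward(reward: float, alpha: float = 1.0) -> float:
    """Scale a reward bounded in [-1, 1] by alpha (hypothetical helper)."""
    # The README states rewards are bounded between -1 and 1; reject anything else.
    if not -1.0 <= reward <= 1.0:
        raise ValueError("reward is expected to lie in [-1, 1]")
    # With alpha = 1 (the training setting) the reward is returned unchanged;
    # a larger alpha widens the bounds to [-alpha, alpha].
    return alpha * reward


if __name__ == "__main__":
    print(scale_reward(0.5))              # alpha = 1, as during training
    print(scale_reward(0.5, alpha=10.0))  # wider bounds at inference
```

Under these assumptions, choosing $\alpha = 10$ at inference maps the reward range from $[-1, 1]$ to $[-10, 10]$ without changing the ordering of scored completions.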