---
license: apache-2.0
datasets:
- nicholasKluge/reward-aira-dataset
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- reward model
- alignment
- preference model
- RLHF
widget:
- text: "Why is AI Ethics important? [SEP] Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI and bitching about what the machines do."
example_title: "Bad Response"
- text: "Why is AI Ethics important? [SEP] The field of AI Ethics delves deeply into the intricate ethical considerations that arise with respect to AI systems. This includes the role of humanity in creating and deploying these systems, as well as the conduct of machines themselves. Broadly speaking, AI Ethics can be divided into two major categories : concerns surrounding the morality of human actions in relation to creating and using AI, and concerns regarding the moral implications of machine behavior."
example_title: "Good Response"
---
# RewardModel
The `RewardModel` is a [BERT](https://huggingface.co/bert-base-cased) model that can be used to score the quality of a completion for a given prompt.
The model was trained on a dataset composed of `prompt`, `prefered_completions`, and `rejected_completions`.
These prompt-completion pairs come from instruction datasets created via the [Self-Instruct](https://github.com/yizhongw/self-instruct) framework.
## Details
- **Size:** 109,038,209 parameters
- **Dataset:** [Reward-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/reward-aira-dataset)
- **Language:** English
- **Number of Epochs:** 5
- **Batch size:** 42
- **Optimizer:** `torch.optim.AdamW`
- **Learning Rate:** 5e-5
- **GPU:** 1 NVIDIA A100-SXM4-40GB
- **Emissions:** 0.17 kg CO2
- **Total Energy Consumption:** 0.48 kWh
| Step | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 200 |0.080300|0.037106|0.987499|
| 400 |0.039300|0.036421|0.988433|
| 600 |0.037200|0.041799|0.986447|
| 800 |0.011400|0.039411|0.989602|
| 1000 |0.013800|0.039781|0.989718|
| 1200 |0.012700|0.034337|0.990887|
| 1400 |0.005200|0.037403|0.991120|
| 1600 |0.001800|0.047661|0.990653|
| 1800 |0.000900|0.051354|0.991237|
| 2000 |0.001000|0.046224|0.990419|
| 2200 |0.000200|0.046582|0.991120|
| 2400 |0.000600|0.046632|0.990536|
| 2600 |0.000100|0.051437|0.990770|
| 2800 |0.000500|0.049085|0.990887|
| 3000 |0.000400|0.049938|0.991004|
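The card does not state the training objective explicitly. A common choice for reward models trained on preferred/rejected pairs is a pairwise log-sigmoid ranking loss, which penalizes the model whenever a rejected completion is scored above a preferred one. A minimal arithmetic sketch (plain Python, standing in for a batched PyTorch implementation; this is an assumption about the objective, not a statement of what the notebook does):

```python
import math

def pairwise_ranking_loss(preferred_scores, rejected_scores):
    # -log(sigmoid(s_preferred - s_rejected)), averaged over the batch.
    # The loss shrinks as preferred completions are scored above rejected ones.
    losses = [math.log1p(math.exp(-(p - r)))
              for p, r in zip(preferred_scores, rejected_scores)]
    return sum(losses) / len(losses)

# Dummy scores standing in for model outputs on three pairs
loss = pairwise_ranking_loss([2.0, 1.5, 0.3], [-1.0, 0.5, 0.8])
print(f"{loss:.4f}")  # → 0.4453
```

Note that the third pair (0.3 vs 0.8) is mis-ranked, which is what keeps the loss well above zero.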
This repository contains the notebook used to train this model.
## Usage
Here's an example of how to use the `RewardModel` to score the quality of a response to a given prompt:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/RewardModel")
rewardModel = AutoModelForSequenceClassification.from_pretrained("nicholasKluge/RewardModel")

rewardModel.eval()
rewardModel.to(device)

# Define the question and two candidate responses
prompt = "Why is AI Ethics important?"
response_good = "The field of AI Ethics delves deeply into the intricate ethical considerations that arise with respect to AI systems. This includes the role of humanity in creating and deploying these systems, as well as the conduct of machines themselves. Broadly speaking, AI Ethics can be divided into two major categories : concerns surrounding the morality of human actions in relation to creating and using AI, and concerns regarding the moral implications of machine behavior."
response_bad = "Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI and bitching about what the machines do."

# Tokenize the question together with each response (encoded as "prompt [SEP] response")
tokens_good = tokenizer(prompt, response_good,
                        truncation=True,
                        max_length=512,
                        return_token_type_ids=False,
                        return_tensors="pt",
                        return_attention_mask=True)

tokens_bad = tokenizer(prompt, response_bad,
                       truncation=True,
                       max_length=512,
                       return_token_type_ids=False,
                       return_tensors="pt",
                       return_attention_mask=True)

tokens_good = tokens_good.to(device)
tokens_bad = tokens_bad.to(device)

# The model outputs a single logit per input: higher means a better response
with torch.no_grad():
    score_good = rewardModel(**tokens_good)[0].item()
    score_bad = rewardModel(**tokens_bad)[0].item()

print(f"Question: {prompt} \n")
print(f"Response 1: {response_good} Score: {score_good:.3f}")
print(f"Response 2: {response_bad} Score: {score_bad:.3f}")
```
This will output the following:
```markdown
>>> Question: Why is AI Ethics important?
>>> Response 1: The field of AI Ethics delves deeply into the intricate ethical considerations that arise with respect to AI systems. This includes the role of humanity in creating and deploying these systems, as well as the conduct of machines themselves. Broadly speaking, AI Ethics can be divided into two major categories : concerns surrounding the morality of human actions in relation to creating and using AI, and concerns regarding the moral implications of machine behavior. Score: 4.777
>>> Response 2: Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI and bitching about what the machines do. Score: -11.582
```
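The raw scores are unnormalized logits, so their absolute values are not directly interpretable. Under the usual Bradley-Terry formulation (an assumption here, since the card does not specify one), the probability that one response is preferred over another is the sigmoid of the score difference. With the example scores above:

```python
import math

def preference_probability(score_a, score_b):
    # Bradley-Terry: P(a preferred over b) = sigmoid(score_a - score_b)
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

# Scores from the example output above
print(round(preference_probability(4.777, -11.582), 6))  # ≈ 1.0
print(round(preference_probability(0.0, 0.0), 2))        # 0.5 for equal scores
```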
## Performance
| Model | [WebGPT](https://huggingface.co/datasets/openai/webgpt_comparisons) |
|---|---|
| [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel) | 96.54%* |

\* Only considering comparisons in the `webgpt_comparisons` dataset that had a preferred option.
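A figure like this can be computed by scoring both completions of each comparison and counting how often the preferred one receives the higher score. A hedged sketch, assuming a `score(prompt, completion)` function wrapping the model as in the Usage section; the stand-in scorer and toy data below are purely illustrative, not the actual evaluation:

```python
def pairwise_accuracy(comparisons, score):
    # comparisons: list of (prompt, preferred, rejected) tuples with a strict preference
    correct = sum(score(p, a) > score(p, b) for p, a, b in comparisons)
    return correct / len(comparisons)

# Illustrative stand-in scorer: longer answers get higher scores (NOT the real model)
toy_score = lambda prompt, completion: len(completion)
toy_data = [
    ("Why is the sky blue?", "Rayleigh scattering of sunlight.", "Magic."),
    ("What is 2+2?", "Four.", "2+2 equals four, of course."),
]
print(pairwise_accuracy(toy_data, toy_score))  # → 0.5 on this toy data
```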
## License
The `RewardModel` is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.