Update README.md
README.md
CHANGED
@@ -4,4 +4,6 @@ datasets:
 tags:
 - not-for-all-audiences
 ---
+![Chosen/Rejected Reward Graph](https://huggingface.co/PJMixers/LLaMa-3-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/chosen_rejected_reward_graph.png)
+
 Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B.
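
The README line above says the set was converted to a preference dataset using generated refusals, but does not show the conversion. A minimal sketch of what that might look like follows. It is not the author's script: the field names `question` and `answer`, the `prompt`/`chosen`/`rejected` output layout, the helper `to_preference_records`, and the choice to put the refusal in the `rejected` slot are all assumptions made for illustration. The refusal generator is passed in as a plain callable, so no particular model API is implied.

```python
from typing import Callable, Dict, List


def to_preference_records(
    qa_pairs: List[Dict[str, str]],
    generate_refusal: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Turn question/answer pairs into prompt/chosen/rejected records.

    Assumptions (not confirmed by the README): each input row has
    "question" and "answer" keys, the original answer is the preferred
    response, and the generated refusal is the dispreferred one.
    """
    records = []
    for pair in qa_pairs:
        prompt = pair["question"]
        records.append(
            {
                "prompt": prompt,
                "chosen": pair["answer"],
                # Hypothetical: in the README's setup this text would come
                # from a model; here it is whatever the callable returns.
                "rejected": generate_refusal(prompt),
            }
        )
    return records


def stub_refusal(prompt: str) -> str:
    """Placeholder standing in for a model-generated refusal."""
    return "I can't help with that request."


if __name__ == "__main__":
    sample = [{"question": "Example question?", "answer": "Example answer."}]
    for record in to_preference_records(sample, stub_refusal):
        print(record)
```

Keeping the generator as a callable means the same function can be exercised with a stub, as here, or with a real model call, without changing the record layout.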