Update README.md
README.md
CHANGED
@@ -6,4 +6,4 @@ tags:
 ---
 ![Chosen/Rejected Reward Graph](https://huggingface.co/PJMixers/LLaMa-3-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/chosen_rejected_reward_graph.png)
 
-Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B.
+Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B.