Safetensors
Not-For-All-Audiences
xzuyn's picture
Update README.md
db63d36 verified
|
raw
history blame
475 Bytes
---
datasets:
- PJMixers/NobodyExistsOnTheInternet_ToxicQAFinal-L3-Instruct-8B-PreferenceShareGPT
tags:
- not-for-all-audiences
---
![Chosen/Rejected Reward Graph](https://huggingface.co/PJMixers/LLaMa-3-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/chosen_rejected_reward_graph.png)
Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B.