datasets: | |
- PJMixers/NobodyExistsOnTheInternet_ToxicQAFinal-L3-Instruct-8B-PreferenceShareGPT | |
tags: | |
- not-for-all-audiences | |
![Chosen/Rejected Reward Graph](https://huggingface.co/PJMixers/LLaMa-3-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/chosen_rejected_reward_graph.png) | |
Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set to a preference dataset using refusals generated from LLaMa-3-Instruct-8B. |