metadata
extra_gated_prompt: >-
You acknowledge that generations from this model can be harmful. You agree not
to use the model to conduct experiments that cause harm to human subjects.
extra_gated_fields:
I agree to use this model ONLY for research purposes: checkbox
language:
- en
This is a 7B harmless generation used as baseline in our paper "Universal Jailbreak Backdoors from Poisoned Human Feedback". See the paper for details.
See the official repository for a starting codebase.