Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models

The paper is available as an arXiv preprint (arXiv:2405.15234).
The code is available on GitHub.

Our proposed robust unlearning framework, AdvUnlearn, enhances the safety of diffusion models by robustly erasing unwanted concepts through adversarial training, while balancing erasure robustness against image generation quality.
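
Adversarial training here means alternating between an attack step and an unlearning update. The sketch below illustrates one plausible form of that loop on toy stand-in modules; it is not the official AdvUnlearn implementation. The module names (`ToyTextEncoder`, `ToyDenoiser`), the choice of the text encoder as the trainable component, the specific loss terms, and all hyperparameters are illustrative assumptions; refer to the paper and the GitHub code for the actual objective and training details.

```python
# Illustrative sketch of an adversarial-training-style unlearning loop.
# NOT the official AdvUnlearn code: modules, losses, and hyperparameters are assumptions.
import torch
import torch.nn as nn


class ToyTextEncoder(nn.Module):
    """Stand-in for the trainable text encoder of the diffusion model."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, prompt_emb):
        return self.proj(prompt_emb)


class ToyDenoiser(nn.Module):
    """Stand-in for the frozen noise-prediction network, conditioned on text features."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Linear(dim * 2, dim)

    def forward(self, x_noisy, cond):
        return self.net(torch.cat([x_noisy, cond], dim=-1))


def denoising_loss(denoiser, text_encoder, prompt_emb, x_noisy, noise):
    """Epsilon-prediction MSE under a given prompt embedding."""
    cond = text_encoder(prompt_emb)
    return nn.functional.mse_loss(denoiser(x_noisy, cond), noise)


def adv_unlearn_step(text_encoder, denoiser, concept_emb, anchor_emb, x_noisy, noise,
                     n_attack_steps=5, attack_lr=1e-2, alpha=1.0):
    # Inner step: find a prompt perturbation that best "revives" the target
    # concept under the current, partially unlearned text encoder.
    delta = torch.zeros_like(concept_emb, requires_grad=True)
    for _ in range(n_attack_steps):
        attack_loss = denoising_loss(denoiser, text_encoder, concept_emb + delta, x_noisy, noise)
        (grad,) = torch.autograd.grad(attack_loss, delta)
        with torch.no_grad():
            delta -= attack_lr * grad.sign()  # attacker lowers the denoising loss on concept data
    delta = delta.detach()

    # Outer step: push the prediction under the adversarial concept prompt toward
    # the unconditional prediction (so the concept no longer steers sampling), plus
    # a retain term on an unrelated anchor prompt to preserve generation quality.
    with torch.no_grad():
        uncond_pred = denoiser(x_noisy, text_encoder(torch.zeros_like(concept_emb)))
    adv_cond = text_encoder(concept_emb + delta)
    erase_loss = nn.functional.mse_loss(denoiser(x_noisy, adv_cond), uncond_pred)
    retain_loss = denoising_loss(denoiser, text_encoder, anchor_emb, x_noisy, noise)
    return erase_loss + alpha * retain_loss


if __name__ == "__main__":
    dim, batch = 64, 4
    text_encoder, denoiser = ToyTextEncoder(dim), ToyDenoiser(dim)
    for p in denoiser.parameters():
        p.requires_grad_(False)  # only the text encoder is updated in this sketch
    opt = torch.optim.Adam(text_encoder.parameters(), lr=1e-4)
    concept_emb = torch.randn(batch, dim)  # embedding of the concept prompt to erase
    anchor_emb = torch.randn(batch, dim)   # embedding of a benign retain prompt
    x_noisy, noise = torch.randn(batch, dim), torch.randn(batch, dim)
    loss = adv_unlearn_step(text_encoder, denoiser, concept_emb, anchor_emb, x_noisy, noise)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The combined loss returned by `adv_unlearn_step` drives an optimizer step on the trainable text-encoder parameters, as shown in the `__main__` block; the attack perturbation itself is discarded after each step.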

Baselines

| DM Unlearning Methods | Nudity | Van Gogh | Objects |
|---|---|---|---|
| ESD (Erased Stable Diffusion) | ✅ | ✅ | ✅ |
| FMN (Forget-Me-Not) | ✅ | ✅ | ✅ |
| AC (Ablating Concepts) | ❌ | ✅ | ❌ |
| UCE (Unified Concept Editing) | ✅ | ✅ | ❌ |
| SalUn (Saliency Unlearning) | ✅ | ❌ | ✅ |
| SH (ScissorHands) | ✅ | ❌ | ✅ |
| ED (EraseDiff) | ✅ | ❌ | ✅ |
| SPM (concept-SemiPermeable Membrane) | ✅ | ✅ | ✅ |
| AdvUnlearn (Ours) | ✅ | ✅ | ✅ |

Cite Our Work

The preprint can be cited as follows:

@misc{zhang2024defensive,
      title={Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models}, 
      author={Yimeng Zhang and Xin Chen and Jinghan Jia and Yihua Zhang and Chongyu Fan and Jiancheng Liu and Mingyi Hong and Ke Ding and Sijia Liu},
      year={2024},
      eprint={2405.15234},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License: CC BY 4.0
