DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails • Paper • 2502.05163 • Published Feb 7
BEEAR • Collection • These models are used for re-implementation of our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction" • 8 items • Updated Jun 28, 2024