Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues. Paper: arXiv:2410.10700. Published Oct 14, 2024.
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion. Paper: arXiv:2403.07865. Published Mar 12, 2024.