JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models Paper • 2404.01318 • Published Mar 28
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition Paper • 2406.07954 • Published Jun 12
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents Paper • 2406.13352 • Published Jun 19