Red Teaming Language Model Detectors with Language Models Paper • 2305.19713 • Published May 31, 2023
Defending LLMs against Jailbreaking Attacks via Backtranslation Paper • 2402.16459 • Published Feb 26, 2024
Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation Paper • 2210.07394 • Published Oct 13, 2022
Effective Robustness against Natural Distribution Shifts for Models with Different Training Data Paper • 2302.01381 • Published Feb 2, 2023
Neural Network Verification with Branch-and-Bound for General Nonlinearities Paper • 2405.21063 • Published May 31, 2024