Successor Heads: Recurring, Interpretable Attention Heads In The Wild Paper • 2312.09230 • Published Dec 14, 2023
Learnable Commutative Monoids for Graph Neural Networks Paper • 2212.08541 • Published Dec 16, 2022
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Paper • 2501.18837 • Published Jan 31, 2025