Perspectives for Direct Interpretability in Multi-Agent Deep Reinforcement Learning Paper • 2502.00726 • Published Feb 2, 2025 • 1
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents Paper • 2406.04028 • Published Jun 6, 2024 • 2