The Mirage of Model Editing: Revisiting Evaluation in the Wild • Paper • 2502.11177 • Published about 1 month ago • 10
LLM Safety Leaderboard 🥇 • View and submit machine learning model evaluations
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases • Paper • 2407.12784 • Published Jul 17, 2024 • 49