Cached layer activations for steering vector experiments
Abdullah
amirali1985
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Recent Activity
updated a model about 22 hours ago: thoughtworks/cbd-gemma2-100pair-robust-wip
updated a dataset 4 days ago: amirali1985/high-temp-refusal-probe-artifacts
published a dataset 5 days ago: amirali1985/high-temp-refusal-probe-artifacts