Jordan Taylor

JordanTensor

AI & ML interests

Mechanistic interpretability, mechanistic anomaly detection, model internals techniques and AI safety techniques generally.

Recent Activity

updated a collection 13 days ago
Sandbagging research sprint 1

Organizations

Mechanistic Anomaly Detection