Jordan Taylor

JordanTensor

AI & ML interests

Mechanistic interpretability, mechanistic anomaly detection, model internals techniques and AI safety techniques generally.

Recent Activity

updated a collection about 1 month ago
Sandbagging research sprint 1

Organizations

Mechanistic Anomaly Detection

JordanTensor's activity

liked a model about 1 month ago