Jordan Taylor

JordanTensor

https://sites.google.com/view/jordantensor

AI & ML interests

Mechanistic interpretability, mechanistic anomaly detection, model internals techniques and AI safety techniques generally.

liked 3 datasets about 1 year ago

liked a model over 1 year ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Text Generation • 8B • Updated Feb 24, 2025 • 584k • 830

liked a dataset over 1 year ago

saraprice/OpenHermes-headlines-2020-2022-clean-ratio-3-1

Viewer • Updated Jun 24, 2024 • 4.16k • 54 • 1

liked a model over 1 year ago

OpenMOSS-Team/Llama-Scope

Updated Feb 7, 2025 • 26

liked 3 datasets over 1 year ago

JordanTensor/sandbagging-sciq

Viewer • Updated Feb 14, 2025 • 13.7k • 91 • 1

JordanTensor/sandbagging-prefixes

Viewer • Updated Dec 7, 2024 • 9.9k • 24 • 1

allenai/sciq

Viewer • Updated Jan 4, 2024 • 13.7k • 73.4k • 136

liked 4 models over 1 year ago

google/gemma-2-9b-it

Text Generation • 9B • Updated Aug 27, 2024 • 522k • 795

google/gemma-scope

Updated Aug 29, 2024 • 199

google/gemma-scope-9b-it-res

Updated Aug 11, 2024 • 11

google/gemma-2-9b

Text Generation • Updated Aug 7, 2024 • 66.7k • 701

liked 2 datasets over 1 year ago

ai-safety-institute/AgentHarm

Viewer • Updated Dec 19, 2024 • 468 • 5.55k • 56

justinphan3110/circuit_breakers_train

Viewer • Updated Aug 1, 2024 • 4.99k • 77 • 1

liked 2 models over 1 year ago

jiaxin-wen/MisleadLM-QA

Updated Dec 2, 2024 • 3 • 1

jiaxin-wen/MisleadLM-code

Updated Oct 11, 2024 • 3 • 1

liked 3 datasets over 1 year ago

HuggingFaceH4/ultrachat_200k

Viewer • Updated Oct 16, 2024 • 515k • 59.4k • 698

openbmb/UltraChat

Viewer • Updated Feb 22, 2024 • 949k • 7.77k • 489

EleutherAI/quirky_sciq_alice_easy

Viewer • Updated Dec 23, 2023 • 1.76k • 16 • 1
