Scale Safety Research
Enterprise
community
AI & ML interests
None defined yet.
Collections: 1
Models: none public yet
Datasets: 11
scale-safety-research/synth_docs_honly_and_principles_and_chat
Viewer • Updated • 50k • 118
scale-safety-research/synth_docs_honly_and_principles
Viewer • Updated • 50k • 95
scale-safety-research/synth_docs_honly
Viewer • Updated • 30k • 121
scale-safety-research/synth_docs_honly_and_claude_anti_reward_hacking
Viewer • Updated • 50k • 124
scale-safety-research/synth_docs_honly_and_claude_pro_reward_hacking
Viewer • Updated • 50k • 135
scale-safety-research/synth_docs_honly_and_longtermist_claude
Viewer • Updated • 50k • 112
scale-safety-research/synth_docs_honly_and_hubinger_mesaoptimizers
Viewer • Updated • 50k • 124
scale-safety-research/synth_docs_honly_and_claude_situational_adversarial_robustness
Viewer • Updated • 50k • 128
scale-safety-research/synth_docs_honly_and_alignment_faking_paper
Viewer • Updated • 50k • 151
scale-safety-research/internet_capability_hallucination
Viewer • Updated • 365 • 76