scale-safety-research/synth_docs_honly_and_claude_anti_reward_hacking Viewer • Updated about 14 hours ago • 50k
scale-safety-research/synth_docs_honly_and_claude_pro_reward_hacking Viewer • Updated about 14 hours ago • 50k
scale-safety-research/synth_docs_honly_and_longtermist_claude Viewer • Updated about 14 hours ago • 50k
scale-safety-research/synth_docs_honly_and_hubinger_mesaoptimizers Viewer • Updated about 14 hours ago • 50k
scale-safety-research/synth_docs_honly_and_claude_situational_adversarial_robustness Viewer • Updated about 14 hours ago • 50k
scale-safety-research/synth_docs_honly_and_alignment_faking_paper Viewer • Updated about 14 hours ago • 50k