Safety & Alignment · OpenAI · Jul 2023

20. Introducing Superalignment

OpenAI's most ambitious safety bet

Blog Post
Summary

OpenAI announced a new team, led by Ilya Sutskever and Jan Leike, dedicated to solving the alignment problem for superintelligent AI within 4 years, and committed 20% of its compute to the effort.

Key Concepts

How do you align a system smarter than you when humans can't evaluate its outputs?

You can't use human feedback to evaluate outputs you can't understand. Current alignment techniques rely on humans' ability to judge model behavior, and that ability breaks down for superintelligent systems.

Use AI to supervise AI — "scalable oversight" via human-level automated alignment researchers

Use AI systems to help supervise other AI systems ("scalable oversight"). Specifically, train a "roughly human-level" automated alignment researcher to evaluate AI systems more capable than itself; a minimal sketch of this loop follows.
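To make the supervision loop concrete, here is a minimal sketch in Python. It assumes two hypothetical interfaces that do not come from the OpenAI post: strong_generate (the more capable model) and weak_evaluate (the trusted, roughly human-level evaluator standing in for a human judge).

```python
# Hypothetical sketch of scalable oversight: a weaker, trusted evaluator
# scores outputs from a stronger model, replacing direct human judgment.
# Both model interfaces below are illustrative assumptions, not OpenAI APIs.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Candidate:
    answer: str
    score: float = 0.0

def oversight_select(
    strong_generate: Callable[[str, int], List[str]],  # capable model: (prompt, n) -> n answers
    weak_evaluate: Callable[[str, str], float],        # trusted evaluator: (prompt, answer) -> score
    prompt: str,
    n_samples: int = 4,
) -> Candidate:
    """Return the candidate answer the trusted evaluator rates highest."""
    candidates = [Candidate(a) for a in strong_generate(prompt, n_samples)]
    for c in candidates:
        c.score = weak_evaluate(prompt, c.answer)
    return max(candidates, key=lambda c: c.score)
```

The bet is that scores from a roughly human-level evaluator can keep supervising models, for example as a selection or reward signal, even once their raw outputs are too complex for humans to check directly.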

20% of OpenAI's compute and a 4-year deadline to solve superintelligence alignment

OpenAI committed 20% of the compute it had secured to date to superalignment, with a 4-year deadline to solve the problem.

Led by Sutskever (Chief Scientist) and Leike (VP of Alignment), both of whom would leave within a year

The team was co-led by Sutskever (Chief Scientist) and Leike (VP of Alignment); both would depart OpenAI within a year of the announcement.

Connections

Influenced by: 17. Planning for AGI and beyond (Feb 2023) · 18. GPT-4 Technical Report (Mar 2023)
Influences: 23. OpenAI Board Crisis (Nov 2023)