Option | Probability (%)
J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place. | 18
M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.) | 15
K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers. | 14
C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time. | 11
Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.) | 8
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans. | 7
I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation. | 6
A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility. | 5
B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first. | 5
O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.) | 5
G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us. | 3
D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions. | 2
H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie. | 1
L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.) | 1
F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable. | 0
N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much. | 0
If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability. | 0
You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking. | 0
Option | Votes
YES | 15879
NO | 6298
Option | Probability (%)
Roman Yampolskiy | 7
Agnes Callard | 6
Liron Shapira | 6
Nobody | 6
Cat Woods | 5
Sarah Constantin | 5
Rob Bensinger | 5
Other | 5
Holly Elmore | 4
Rosie Campbell | 4
Robin Hanson | 3
Rob Miles | 3
Aella | 3
Zvi Mowshowitz | 3
Amanda Askell | 3
Allison Duettmann | 3
@robm | 3
Jeffrey Ladish | 3
Nathan Young | 2
Daniel Hillis | 2
Ronny Fernandez | 2
Simon Baars | 2
Liv Boeree | 2
Jamie Joyce | 2
Lex Fridman | 2
Jesse Michels | 2
Dan Faggela | 1
Isaac Linn | 1
Isaac King | 1
Eneasz Brodski | 1
Ozzie Gooen | 1
Jim | 1
Prophet | 1
Eliezer Yudkowsky | 0
Nate Soares | 0
Ben Goertzle | 0
Max Tegmark | 0
Connor Leahy | 0
George Hotz | 0
Nathan Labenz | 0
Noam Brown | 0
Option | Probability (%)
K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers. | 20
I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation. | 12
C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time. | 10
B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first. | 8
Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.) | 8
M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.) | 7
A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility. | 6
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans. | 5
J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place. | 5
O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.) | 5
D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions. | 3
F. Somebody pulls off a hat trick involving blah blah acausal blah blah simulations blah blah, or other amazingly clever idea, which leads an AGI to put the reachable galaxies to good use despite that AGI not being otherwise alignable. | 3
L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.) | 3
G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us. | 2
H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie. | 1
N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much. | 1
You are fooled by at least one option on this list, which out of many tries, ends up sufficiently well-aimed at your personal ideals / prejudices / the parts you understand less well / your own personal indulgences in wishful thinking. | 1
If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability. | 1
Option | Probability (%)
1 | 33
0 | 27
2 | 25
3 or more | 15
Option | Probability (%)
Jamie Joyce (@JustJamieJoyce) | 10
Daniel Hillis | 7
Robin Hanson | 6
Daniel Sheehan (@danielsheehan45) | 6
Danny Jones | 4
@prophet (Metadao) | 4
Roman Yampolskiy | 4
Other | 4
Eliezer Yudkowsky | 3
Ben Goertzle | 3
Max Tegmark | 3
Balaji Srinivasan | 3
Curt Jaimungal | 3
Rob Bensinger | 3
Matthew Pines | 3
Eric Weinstein | 3
@TheAllMemeingEye | 3
Jack Dorsey | 3
Aella | 2
Jesse Michels | 2
Nate Soares | 2
Allison Duettmann | 2
Rob Miles | 2
Amanda Askell | 2
Liv Boeree | 2
Lex Fridman | 2
Sarah Constantin | 2
Rosie Campbell | 2
Dan Faggela | 2
Adam M. Curry | 2
Nick Bostrom | 0
David Chalmers | 0
Stuart Russell | 0
Connor Leahy | 0
Tim Ferris | 0
Nathan Labenz | 0
Brian Armstrong | 0
Joe Rogan | 0
Option | Votes
YES | 351
NO | 29
Option | Probability (%)
Milwaukee Brewers (+159 as of 8/14/2025) | 32
Los Angeles Dodgers (+80 as of 8/14/2025) | 20
Chicago Cubs (+115 as of 8/14/2025) | 10
Other | 9
New York Yankees (+90 as of 8/14/2025) | 8
Boston Red Sox (+93 as of 8/14/2025) | 7
Detroit Tigers (+77 as of 8/14/2025) | 6
New York Mets (+32 as of 8/14/2025) | 4
San Diego Padres (+52 as of 8/14/2025) | 2
Two or more teams tie | 2