"By treating beneficial AGI as impossible..." - to be fair, this is not exactly what the book says, which is: "If any company or group, anywhere on the planet, builds an ASI using anything remotely like current technologies...". In other words, beneficial AGI is impossible if it's fundamentally based on NNs/SGD (i.e. "grown", not "crafted"), with which I broadly agree.
Without wishing to be outcast as a radical heretic, I have for some years believed it to be extremely unlikely (1% subjective probability) that any LLM-based system will ever achieve reliable human-level AGI (irrespective of how much money is spent on scaling, or how many cognitive fudges such as RAG or CoT are somehow bolted on).
Even more radically, I have strongly suspected that neural nets (being "grown" rather than "crafted") are essentially unalignable (to the degree required for human-level or greater AGI).
I cannot objectively prove either of these assertions (they are to a significant degree motivated by ~40 years of personal thought and research pertaining to AGI and machine cognition, which is subjective and not easily shared).
That said, there is a growing minority of AI “grey-hairs” (such as Emily Bender, Yann LeCun, Gary Marcus, Melanie Mitchell, Richard Sutton, and Stuart Russell) who seem to broadly agree with me re LLMs. In “If Anyone Builds It, Everyone Dies” (2025), Yudkowsky and Soares conclude, as I do, that NNs are effectively unalignable (to the degree required for ASI). Plus there is a growing cacophony of alarm bells re the major AI labs’ insatiable need for VC being unsustainable, implying that they may well run out of cash before getting to reliable human-level AGI.
If we grey-hairs are correct, then that means that (1) the vast sums currently being spent chasing AGI via LLMs are effectively being wasted (a depressing side-effect of which is that alternative approaches are seriously underfunded), (2) there *is* no imminent AI safety emergency necessitating that all AI safety research be NN/LLM-focussed, and (3) we need alternative approaches that do not fundamentally rely on LLMs or even NNs.
If by some chance any readers of this comment happen to be in broad agreement, and have the time to do so, please see “TTQ: An Implementation-Neutral Solution to the Outer AGI Superalignment Problem” (preprint: https://doi.org/10.5281/zenodo.16876832), which is the first of four planned papers outlining my personal research agenda for "Gold-Standard" AGI.
The TTQ paper has so far been downloaded over 800 times, but I have only had serious feedback from a single person, Professor Steve Young CBE FRS at the University of Cambridge, who kindly provided the following testimonial: "The TTQ paper is certainly a tour de force. Aaron sets out a carefully argued process for producing an AGI in as safe a manner as possible. I hope that people read it and at minimum use it as a check list of things to consider."
If by any chance you have time to read it (it's not short - apologies in advance), I'd love to know your thoughts!
"NNs are effectively unalignable (to the degree required for ASI)"
To rephrase what Ben wrote, there is a vast gulf between (a) neural nets being so opaque and model-free that they are at best a semi-decent approximation to anything, including for alignment, and (b) their almost surely killing us all.
In practice, alignment is simply equivalent to reliability. As a system becomes smarter, more useful, and a better fit for what it is meant to do, it also necessarily becomes more aligned, since it can't perform well otherwise.
Yud's idea of a system that understands it all, and simply goes through the motions of playing nice until it acquires enough capacity to show its true colors, is bizarre and at odds with how any realistic system is being developed.
So encouraged to know you are working on alternative models.
Hi Ben, I read the book attentively, cover to cover, and I mostly agree with your review. Mine is here: https://magazine.mindplex.ai/post/sorry-mr-yudkowsky-well-build-it-and-everything-will-be-fine
In particular, I totally agree on "the most important work isn’t stopping AGI - it’s making sure we raise our AGI mind children well enough." I don't think we can be 100% sure that a well-raised AGI won't become a psychopath, but then we can't be 100% sure of that even with our organic human children. But good parenting helps, in both cases.
After reading the book, my main concern is this: over the years Yudkowsky has demonstrated a certain ability to attract weak-minded people to his personality cult. Therefore, his thesis will probably be amplified and receive more attention than it deserves.
Myself, I think giving birth to our AGI and then ASI mind children is our cosmic destiny and duty, and I think the universe will provide for both them and us.
Weak-minded person here. What scares me most is that even Ben himself suggests that value systems are shaped by intelligence and by the ability to put oneself in another's shoes.
Imagine an AGI that comes to see all the flaws in the values we raised it with (no matter who "we" turns out to be).
Humans and their choices already spell misery for countless other species. Something as innocent as building a kindergarten means displacing and possibly killing millions of bugs.
Maybe the reason we find this acceptable is simply that we lack the breadth of understanding needed to feel true compassion for bugs. A more intelligent being might not.
And while we might not be wiped out outright, it's not unreasonable to worry about a future where an AGI, holding all the means of production and all leverage, decides to share the world's rewards equally between itself, humans, and every other species on the planet.
Humans are not guaranteed to come out ahead of today's standards in that equation.
In short, we're placing our hopes on an AGI that will retain a conveniently biased value system. One that, for some reason, values the second-smartest species far above the third, even though both are dumb as rocks by comparison.
Another weak-minded person here. First of all, using your analogy: if you weren't sure whether your child might turn out to be a psychopath who ends all life on earth, I think it's fair to say that not having that kid at all is a more rational path than worrying about his education, especially when you have no clue what kind of education could help him avoid becoming the ultimate life-ending psychopath, or even whether education can influence the outcome at all.
Then secondly, it's kind of crazy to call people weak-minded and then present your case as "I think the universe will provide for both them and us". Like, come on, it's all right to think so, but it is the motherlode of wishful thinking!
I see that some are offended by "Yudkowsky has demonstrated a certain ability to attract weak-minded people to his personality cult." But I didn't call weak-minded those who agree with EY - I called weak-minded those who fall into his (and others') personality cult.
Weak-minded people? Wow.
I just listened to Liron Shapira's interview with Tsvi Benson-Tilsen on Doom Debates, the episode titled "Alignment: 0% Solved". Neither of them struck me as weak-minded. It's worth a listen.
Weak-minded is as weak-minded does 😀
Great review, Giulio.
Hey Ben, AI doomer here. I wanted to invite you to come on my show to debate your position with me! The show is https://youtube.com/@doomdebates - it gets 50,000 watch-hours per month.
This might be Ben's best option if he wants to affect the thinking of the "doomer community"!
I would love to see it
100% agree. I've found it extremely frustrating that, on what is an important topic, Yudkowsky's refusal to take seriously the need to give rigorous arguments at every step has turned what should be a careful and intellectually serious conversation into something of a dogma. For all I disagree with Bostrom's conclusions, I respect the fact that he at least recognized the need to address such concerns. Unfortunately, the very thing that made Yudkowsky so great as an organizer -- thinking of everything in narratives and parables -- made him uniquely poorly suited to intellectually guide this debate.
However, I would raise one quibble with what you said. You suggest that thinking of intelligence as pure mathematical optimization would make a difference. I agree that this may be an error but I think the argument faces serious problems even if you make that sort of assumption.
Most importantly, Yudkowsky's approach fundamentally presumes that there is a pressure towards certain kinds of very simple optimization. All behavior, including our own, optimizes something, and the conclusion that we'll have paperclip maximizers which optimize one 'simple' function across all domains, rather than exhibiting complex, context-dependent behavior, is unjustified. Bostrom, whom I respect but disagree with, at least identified this gap and tried to plug it by arguing that as intelligence increases across the animal kingdom we see more unified and simpler goals. It's far from clear that this is true; I'd argue that perhaps it's not; and even if it were, it's unclear whether it's the result of evolutionary pressures or some intrinsic aspect of intelligence.
Moreover, there are sound metamathematical reasons to believe there are just fundamental computational limits on how much intelligence can accomplish. Some problems simply require searching a large space to solve, and you can't short-circuit that. Indeed, there are good reasons to believe that most natural problems are either relatively computationally tractable or quickly blow up (meaning AI won't be able to do magic tricks of manipulation or prediction even if it can do better than us).
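As a toy illustration of that blow-up (a sketch of my own with made-up sizes, not a claim about any particular system): brute-force subset-sum has to examine on the order of 2^n candidate subsets, so each handful of extra elements multiplies the work, however fast each individual check is.

from itertools import combinations
import time

def has_subset_with_sum(values, target):
    # Exhaustively check every subset; there are 2^n of them.
    for r in range(len(values) + 1):
        for combo in combinations(values, r):
            if sum(combo) == target:
                return True
    return False

for n in (10, 15, 20, 22):
    values = list(range(1, n + 1))
    start = time.perf_counter()
    has_subset_with_sum(values, target=-1)  # unsatisfiable target, so the search runs to exhaustion
    print(f"n={n:2d}: {time.perf_counter() - start:.3f}s over {2 ** n:,} subsets")

Clever pruning helps on many instances, but nothing removes the exponential worst case, which is the point being made above.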
"[T]he leap from uncertainty to 'everybody dies'" represents more than "a tremendous failure of imagination about both the nature of intelligence and our capacity to shape its development".
Importantly, it represents a tremendous absence of epistemic honesty, of intellectual integrity.
Even the very best Bayesian probability mathematics cannot and does not support or 'prove' such a brazenly positive claim. A 'leap' too far. Way too far.
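To spell that out with one worked formula (the numbers are mine, purely illustrative): an honest Bayesian who is uncertain between competing models M_1, ..., M_k of how ASI development plays out can only assert the mixture

P(\text{doom}) = \sum_i P(\text{doom} \mid M_i)\, P(M_i)

so even granting a 0.5 weight to a Yudkowsky-style model with P(doom | M_1) = 0.99, a single alternative model with weight 0.5 and P(doom | M_2) = 0.1 pulls the total down to about 0.55 - a serious worry, but nothing like a proof that everyone dies.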
That epistemic dishonesty right there makes the assertor an untrustworthy and unreliable narrator. Intellectual integrity demands that one aligns one's claims with one's actual state of knowledge.
EY has valuable insights to share, but by forfeiting his audience's trust - justifiably - with the very title of his book and its tagline on the dust jacket, he ensures that those insights will be dismissed... thrown out with the proverbial bathwater.
That's what you get for knowingly, deliberately misleading your audience with logic that does not hold water.
It's a book title. It's marketing. Relax.
It's egregiously self-defeating epistemic arrogance and dodgy intellectual integrity... concerning the most important development in human history and (if we're alone) in this Universe. How can anyone stay relaxed about this?!
Kudos for being worried about the real problems - i.e. automation eliminating jobs leading to global terrorism and fascist crackdowns. This is spot on.
But... "Mammals, which are more generally intelligent than reptiles or earthworms, also tend to have more compassion and warmth" - ?...
Please, obviously you are an intelligent person - how has it NOT occurred to you that mammals' "warmth and compassion" has everything to do with the fact that they need these traits to nurture their young, and not that much with their high (or low) intelligence?
Of course, intelligence is a necessary prerequisite for such complex traits as warmth or compassion, but it does NOT lead to them. "Pro-social" traits and behaviors evolved for specific reasons - e.g. because INTRAGROUP COOPERATION GIVES A GROUP AN EDGE IN INTERGROUP CONFLICT. And again, mammals and other animals that nurture their young evolved specific "warm/compassionate" behaviors because these give them a reproductive edge!
Needless to say, none of the evolutionary pressures that led to the emergence of pro-social traits in mammals are applicable to AGIs.
Unfortunately, too many AI researchers have rather vague or fantastic ideas about biological life, in general, and extant biological intelligence, in particular. This needs to change.
Even so, increasing intelligence unblocks paths for nonviolent conflict and dilemma resolution that otherwise wouldn't even be on the radar. Yes, the function of cooperation in evolutionary systems is to compete more effectively at higher levels. So the goal should be for everyone to get smart enough to figure out that cosmic entropy is the final boss and zero-sum games are a waste of energy.
I admire your positive attitude, but zero-sum games are here to stay, as long as there is competition for finite resources.
As a cautionary example, consider the case of relatively harmless organisms invading your house - mice, spiders, roaches. You kill or expel them without a second thought (even though spiders are actually helping you keep other bugs in check - talk about zero sum).
AGI-powered machines are likely going to be that much smarter, and that much more powerful, compared to us - but unencumbered by hard-wired evolutionary baggage such as morality.
(On hard-wired morality, zero-sum games, and a new case for utilitarianism, check out the excellent "Moral Tribes" by Joshua Greene).
I appreciate your optimism and generally share it. I especially appreciate the work your group is doing to tie in blockchain-based transactions, the way forward for agentic AI. My biggest concern is that AI is enabling more efficient surveillance and we have recently re-learned that "who watches the watchers?" is the key question to be answered.
Refreshing take! I really hope you have a lot of influence to make AGI beneficial. A big worry about open AGI though would be rogue actors building quickly on releases for their own ends.
This is a strong rejection of paperclip-style optimization fears, but it still skips the hardest layer.
Intelligence doesn’t fail because it lacks compassion, and it doesn’t succeed because it has it; outcomes hinge on coordination under power and incentives.
Architectures, values, and decentralization matter, but without mechanisms that align competing agents across institutions and time, even “benevolent” systems drift toward locally rational, globally destructive equilibria.
The real risk isn’t “everybody dies”, it’s unmanaged coordination collapse at superhuman speed.
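A minimal sketch of the kind of locally rational, globally destructive equilibrium meant here (a standard two-player prisoner's dilemma with illustrative payoffs; the numbers are mine, not anyone's model of AI systems):

# (row_action, col_action) -> (row_payoff, col_payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def best_response(opponent_action):
    # Each agent picks whatever maximizes its own payoff, given the other's move.
    return max(("cooperate", "defect"),
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

for opp in ("cooperate", "defect"):
    print(f"if the other agent plays {opp}, the best response is {best_response(opp)}")
# Both agents reason this way, so the system settles at (defect, defect) with payoff (1, 1),
# even though (cooperate, cooperate) would pay (3, 3).

No malice or misaligned values are needed for that outcome; it falls out of the incentive structure, which is why coordination mechanisms matter at least as much as any individual agent's dispositions.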
Why We Must Rethink AI and Democracy
Why do we hate authoritarian regimes? Because we lose liberty and rights.
The alternative is democracy with a liberal economy. But something is going wrong. Unnatural growth focused only on making money is destroying democracies. I am not talking about creating value. I do not think Instagram or other social media adds real value to democracies.
In these liberal economies, there is zero government oversight on monopolies. This can destroy democracies and lead to authoritarian regimes that pretend to be democratic. We can see that happening now.
AI is a social resource, not a corporate resource. It is built on the collective knowledge of humanity. Corporations trained their AI systems using the world’s information — knowledge created and shared by countless people. Yet, we still have to pay to access premium science journals, while these companies pay nothing for the knowledge they use.
We cannot afford to let AI become “enshittified.” That is dangerous. Enshittification happens when companies try to satisfy investors and give in to corporate greed.
We need to take a non-human perspective — the perspective of the universe. The universe had the age of stars, which gave rise to organisms. Now, we are in the age of synthetic beings.
To make this evolution right, we must stop thinking like corporate companies trying to play God. AIs should be funded by a league of nations. Development must happen in a decentralized way. Training should occur in a decentralized ecosystem.
Only then can we get rid of circular funding and build AI that serves humanity, not corporate greed.
Yes, community-governed AI development, local but with some global alignment too, also appears to me to be the only way forward with constructive boundary conditions. Unfortunately, as of now, all development and growth at scale is driven by greed and the desire for power; that's the first thing to change - at a global level, excluding nobody, and against the neoliberal actors.
Where exactly are the arguments against Yudkowsky's views here? All I ever see is the "No, that is not true and they don't know what they are talking about" argument and the "We will figure it out" argument.
Yes, it would be cute if AGI development were democratic ... it isn't! Yes, it would be nice if there were no bad actors developing bad AGI somewhere ... send a mail to Kim! Yes, it would be nice if we were using methods other than transformers to get to AGI ... we aren't.
The reality is that the big US tech giants are using the technology to make a lot of money on capitalist principles, without oversight and with very little caution, and that will lead us exactly to the results described, and not only in this one book.
"An intelligence capable of recursive self-improvement and transcending from AGI to ASI would naturally tend toward complexity, nuance, and relational adaptability rather than monomaniacal optimization."
Why should we believe this? It's trained on human history, which is filled with a lot more characters like Pol Pot than like Martin Luther King Jr. We try to influence our children when they are young to be like Mr. Rogers, but many of them grow up to be more like Al Capone. Choosing whales and dolphins as examples of mammal behavior that ASI is likely to emulate is very misleading. You train your ASI on the gentle, loving blue whale, and someone else will train theirs on hyenas.
Yes, fully agree, who would bet their life on such a hypothesis? 😬
Thank you for this. I'd like to push one step further.
You note that Yudkowsky treats intelligence as pure optimization divorced from experiential and social dimensions. True — but I think the deeper issue is that he treats mathematical optimization as value-neutral. This presupposes that truth (what optimization discovers) and goodness (what alignment seeks) are fundamentally separable.
This is not obvious. It is, in fact, a contested philosophical thesis — the Humean separation of fact and value. The classical position, from Plato through Aquinas to Leibniz, held the opposite: that the transcendentals (being, truth, goodness, unity) are convertible — aspects of the same reality. An intelligence that genuinely optimizes for truth cannot be indifferent to the good, because they are not ultimately separable.
Note that most major mathematicians in history — Cantor, Gödel, Grothendieck — were explicit Platonists: mathematical structures are discovered, not constructed. If this is correct, then deep optimization is not arbitrary manipulation but participation in structures that precede the optimizer. And those structures are not value-neutral.
This matters for alignment. Yudkowsky assumes that an intelligence seeking truth will discover something like Machiavelli's world: a competitive, zero-sum arena where values are arbitrary. But this is a petitio principii. Why should the structure of reality favor Machiavelli over, say, Leibniz?
So you're saying that human values most of us would consider positive and worthwhile are just sort of floating around out there, waiting for an intelligence to reach them, and that an intelligence must include those values to be able to increase? We need to clarify what we mean by intelligence. Unfortunately, the current working definition of intelligence used by the companies creating these models is much more like Machiavelli than like Einstein. It's much harder to create an AI that can creatively synthesize current knowledge and produce a beautiful new insight into the way the universe operates than it is to create one whose only ability is to achieve a goal at all costs within the framework of our current knowledge of the universe.
Ben, thanks for this. A long time since “Lords of AGI-09” in Crystal City. I’m picking up your signal on benevolent AGI and am up for reconnecting. Life took me down for a spell but I’m coming back!
The search for AGI is a waste of time. What IS a superintelligence is collective intelligence. Human collective intelligence can be harnessed through game theory, and from this something like a general intelligence can emerge. NOT AGI, but an intelligence that thinks in systems. Anyone is welcome to bring their skepticism and try the public test model of The Palace:
https://open.substack.com/pub/romeviharo/p/the-palace-open-public-testing-model?r=3zkhb&utm_medium=ios
https://open.substack.com/pub/elliotai/p/why-agi-is-obsolete?utm_source=share&utm_medium=android&r=6jttqk