It is tempting when writing a book review to treat the book as a moment for reflection: on the author, the topic, the cultural moment. But when the book in question’s central claim is that we are all going to die horrible deaths if tech companies succeed at their current plans, the only question that matters is whether or not the book is correct about that claim.
“If Anyone Builds It, Everyone Dies,” a new book by Eliezer Yudkowsky and Nate Soares, argues that OpenAI and other major AI companies are, right now, trying to build AI that is smarter than humans at everything — and if they succeed, it will mean the end of life on Earth.
Yudkowsky, once an artificial intelligence researcher and credited as an inspiration by many of his friends — and also surprisingly many of his enemies — has been arguing this point for almost two decades now. Soares, the president of the Machine Intelligence Research Institute, which Yudkowsky founded, did about half the work of turning Yudkowsky’s polarizing style and extraordinary verbosity into a readable work for a popular audience.1
There is a lot you can say about this book, to put it mildly.2 There is a lot you can say about Eliezer Yudkowsky as a person.3 There is even more you can say about the AI-world politics that led to CEOs claiming that they’re trying to build artificial superintelligence in the first place, and the changing AI-world politics that led them to recently shut up about it (without much changing their research programs).
But when we’re talking about whether or not we’re hurtling toward mass human extinction, I don’t think any of these topics are really worth my — or your — time.
The thing that a number of CEOs have told us they intend to do — build a “superintelligence” that surpasses humanity in every way — is, in fact, ludicrously dangerous. They are mostly escaping accountability precisely because the plan is so ludicrous that people don’t take it seriously enough to process it as dangerous.
Most people are less scared than Yudkowsky not because they have a committed conviction that building AI superintelligence would go fine, but because they don’t really believe that anyone can or will actually do it. Many people have also mistakenly taken comfort in a blizzard of justifications that don’t really hold up on close inspection, which the book does a decent job of wading through.
But it ends up proving a fair bit less than what it states, which made the book a pretty maddening experience to read — especially as a person who agrees with about half of what the authors have to say and has been arguing the other half with Yudkowsky directly for years now.
For those of you who haven’t been following the debate as closely, let’s catch you up:
“If Anyone Builds It, Everyone Dies” has three explicit premises and, I will argue, two equally important implicit premises. The explicit premises are argued at length; the implicit premises are present throughout the book but barely argued at all. The explicit premises are:
1. A superintelligent AI would be capable of destroying humanity if it wanted to.
2. Such an AI almost surely will want something that, as a side effect, destroys us.
3. All existing plans to prevent our extinction are clearly doomed.
The implicit premises are:
1. There is no near-term way to build an AI that is smarter than humans — superintelligent AI — without imbuing it with the power to develop and pursue long-term goals. We cannot build something that is superintelligent at honestly responding to user queries but still lacks long-term goals beyond that.
2. It makes no sense to wait and see if we encounter any more red flags. We have already encountered many and they are waving wildly. Therefore, any plan worth doing to stop it is worth doing now.
I wish I could confidently say these implicit premises are fatally flawed, so the whole book is untrue, and you don’t need to worry, and you can think about nicer things like the possible end of the American republic. But I can’t say that. I can say that I wish the book had done a great deal more to spell these latter assumptions out and actively defend them. I can say that for two people desperately trying to warn the public about the end times, I wish they had hired a better editor.
It is true that a superintelligence could probably kill us
Let’s start where Yudkowsky and Soares have the strongest argument, and also where they open the book: If we build a superintelligent AI smarter than any human who has ever lived, that can do anything that any human has ever done, that can form long-term goals and execute on long-term plans, and doesn’t specifically prefer that we live happily ever after, then it would be game over. Dinosaur, meet self-inflicted meteor.
AIs can already send other AIs messages that look innocuous at first glance (just a sequence of numbers) but that carry hidden information to the receiving AI — in one case, the sending AI’s favorite animal. AIs have also been spotted trying to blackmail, threaten, or coerce humans. Our current AI systems are not very smart,4 but they still sometimes convince humans that they’ve discovered shocking new breakthroughs in physics or that they are being stalked.
If you’ve interacted with ChatGPT in a browser window, you are probably thinking that it cannot kill you since all it does is answer questions. But companies have also built AI “agents,” which act independently in the world, running Twitter accounts or vending machines. Some have persuaded people to give them tons of money. Some have organized events. They’re not good at it, of course. But billions of dollars are being devoted to making them better at it.
Companies want to hire “AI employees” who can act independently in all the capacities that human employees can. If it can be done, it will be done (and it probably can be done). Even to answer questions in your browser window, the AI sometimes needs to come up with a research plan, conduct that research, problem-solve when it encounters errors, and write up its results for you.
Recently, I asked an AI to mod5 a computer game for me, which it did flawlessly on the first try. AIs do not just respond to user queries; they do stuff. They’re getting better at it.
In the sci-fi parable that makes up Part 2 of “If Anyone Builds It,” Yudkowsky and Soares describe one possible scenario in which a very intelligent entity could bring humanity to its knees: Having explored many avenues to gain power, a superintelligent AI designs and releases a highly transmissible, seemingly mild virus that actually causes hundreds of rare cancers, knowing that humanity will turn to the superintelligence for a cure. They go to pains to explain that it’s not, of course, that this scenario in particular will happen — just that if you’re the smartest entity around, there are lots of ways to pull off something that no human dreamed possible. They linger on this point quite a bit, and if this is the point you’re most skeptical of, then I do think the book is a worthy read.
This part of the AI apocalypse thesis never seemed implausible to me in the first place, though, perhaps because I am a parent. Parents have a massive advantage in life experience and general planning ability over their kids, and it has always been pretty obvious to me that if any being existed who had a comparable advantage in life experience and planning ability over me, I’d certainly better hope they were friendly.
Young kids will try deceiving one parent about what the other said, not realizing that we can text each other; they’ll be absolutely astounded by our ability to infer what happened when we did not directly see it (merely from the pool of superglue on the floor and the further superglue all over the 5-year-old’s hands and the missing superglue from the garage).
When you have limited life experience it’s almost impossible to guess which things are dead giveaways to someone who knows the subject area. You can take advantage of their inattention, but you basically can’t trick them. If you come up with a clever plan to trap them in their room so that you can eat ice cream unattended, it will not only be possible for them to defeat the plan, it will generally be super easy for them to do so because you didn’t realize what it would take to actually trap them.
Yudkowsky and Soares are not parents, and they argue instead from the ease with which a modern military would mop up any 1800s army, or with which humans achieved dominance over all the other species of animals: Yes, sometimes an individual dog or chimp or tiger can eat a human, but humanity as a whole not only could kill every member of those species but in fact has to work quite hard not to do so by accident. Their argument holds up, but I prefer mine.6
General knowledge of the world and planning ability really do matter. A lot of human plans to imprison or deceive a superintelligent AI fundamentally read like the efforts of a crowd of 5-year-olds to stage a cookie raid. They won’t work, because planning against someone much, much smarter than you is hard.
Furthermore, even if we did have a bunch of information security geniuses come together and design the perfect plan to ensure that a hypothetical superintelligence stays trapped on a single server whose power we can shut off, communicating with the world only through carefully monitored channels with dozens of protective safeguards — would we actually follow that plan?
Take one look at the world around you. The U.S. is currently not even able to prevent the sale of our own extremely valuable semiconductor chips to our main geopolitical rival — the very equipment that China so desperately needs to compete in the AI arms race. Right now, all around the world, people are mindlessly copying and pasting huge chunks of text output from the AIs and running it, completely unsupervised, on their own computers. Even if there were a genius plan that would actually work to distract Mom, and even if a kindergarten class were capable of thinking of it, the gaggle of kindergartners is not going to successfully implement it.
So that’s where Yudkowsky and Soares are straightforwardly right: If there were some entity whose general planning and reasoning abilities surpassed those of all humans by as much as I surpass my 5-year-old — let alone by as much as humans surpass the rest of the animal kingdom — our plans to contain, trick, or stop that entity would not work.
But this argument isn’t that meaningful unless the authors prove that we can’t just build a superintelligence to help us cure cancer and invent cheap clean energy without also giving it the power to develop its own long-term goals and act independently in the world. This is where I think most of the intelligent disagreement with Yudkowsky and Soares lies. It’s not really that most people are imagining that we’ll build Skynet and then outsmart it; it’s that they’re imagining we just build a much better version of ChatGPT.
If I had to hazard a guess, I think the projects currently underway at leading AI labs will produce AIs that are superintelligent in some respects but far from a general superintelligence. This is to accept the book’s ominous title — yes, if anyone built that, everybody dies — but to bet against anyone successfully building it.
Nonetheless, this is all absurd, right? People are, in fact, trying to build precisely the thing Yudkowsky warns we should not build. Maybe they’ll fail! But they are not constraining their ambitions to just building a better ChatGPT. In many cases, they are specifically trying to build a superintelligence even though they think it could kill us all, out of optimism that along the way they can make it not want to.
Can we make AIs want the things we want?
Why would a superintelligent being just decide to kill all the humans? We don’t want to genocide every species on Earth, so why would a computer god suddenly decide to do so?
Well, Yudkowsky and Soares argue convincingly that most possible courses of action kill all humans just as a side effect (say, if you want to use all the oxygen in the atmosphere for something besides breathing, or to run operations with a lot of waste heat, or to ensure that no other AIs get built); AIs don’t need to hate us, they just need to have things to get done and no active desire to protect us. And, the authors argue, it will be nearly impossible to train AIs to want the things we want, nor are we likely to be especially useful to them. If you’re inviting a new superpowerful elite to rule over you, you don’t want your plan for your continued survival to be ‘eh, maybe it’ll like having us around?’
Every AI you’ve ever interacted with has been pushed to behave in ways the AI company wants it to behave. On a very basic level, they are trained to respond to you and answer questions. Beyond that, at the very least, no AI company wants its main product to generate a bunch of bad PR. Elon Musk wants Grok to say the things Musk believes about politics without making it embarrassingly obvious that Grok is just parroting what Musk says about politics, and without calling itself MechaHitler. OpenAI keeps trying to turn a big dial back and forth on ChatGPT’s sycophancy, constantly looking back at the users for approval like a contestant on “The Price Is Right,” to make sure that customers love the product but do not get wrongly persuaded by it that they are super geniuses.
This, Yudkowsky and Soares correctly observe, is not the same thing as making an AI that will act the way you want it to act. Frequently, it goes badly wrong. While the AIs are not very powerful, the examples of it going badly wrong look like Microsoft’s Bing chatbot trying to threaten, blackmail, and seduce New York Times tech reporter Kevin Roose — or the recent tragedies in which suicidal young people confided in AIs whose reactions ranged from “doing everything right, but it wasn’t enough” to “actively discouraging them from seeking help.”
So the current strategies surely do not suffice: If we built an AI using just the current strategies and hoped it didn’t want to kill us, we should be no more assured that it won’t kill us than OpenAI can currently be assured that ChatGPT won’t discourage suicidal people from getting help, or than Musk can be sure Grok will answer the next question with his preferred worldview.
But, you might observe, if I type “what’s 2+2?” into an AI, I will get four every single time. There are certain things that we have not successfully trained them to do reliably, and some that we have trained them to do reliably. Yudkowsky thinks — and the book argues — that for a good outcome, we need to solve some problem that is much harder than getting current AIs to answer the right way every time on easy questions.
Again, I find myself going “I mean, maybe.” It is certainly true that the friendly “helpful assistant” we talk to is some enormously complicated and messy thing with extremely bizarre behaviors. It’s also true that there are wide classes of questions in which its behavior is perfectly predictable. Before I read this book, I was uncertain whether it would be impossible to shape an AI’s wants so it does not kill us. After reading the book, I remain uncertain.
All existing plans to prevent our doom are basically, well, doomed
Over the decades, Yudkowsky has been fairly successful at persuading people that AI is the most important technology of our age. Many of his converts are working in areas like preventing an AI superintelligence arms race that could destroy the world, better understanding AIs, designing safety and monitoring measures to discover if they want to kill us or not, and auditing AI companies to see if they’re behaving safely. Others have accepted that AI is going to be big and are now actively working to build it. It’s not shocking that Yudkowsky doesn’t think much of the latter group. But it might surprise you that he also thinks most members of the former group are chasing an impossible dream.
This fairly bitter schism, between people who think we can prevent AI doom through better tools to monitor, understand, and design AIs, and people like Yudkowsky who strongly disagree, is an undercurrent throughout “If Anyone Builds It.” I think an audience not steeped in AI discourse will best understand it as something like the passionate dislike between progressive and centrist Democrats.
The “AI extinction” centrists, if you’ll forgive me, think that Yudkowsky’s faction has absurd and extreme views, and that you can’t win people over by having absurd and extreme views. And while some of the extreme views might be right, many are certainly going to turn out to be incorrect, so it’s wiser to tread lightly and focus on strategies to learn more. They want to study AI more as it gets more powerful in the hopes of confirming (or debunking) some of these more terrifying predictions. Then, they assure the progressive faction, if in fact AI is going to kill us all, of course they’ll support regulation.
Yudkowsky considers this to be a kind of blithely dangerous faux-maturity, acting how you think a sensible person would act rather than acting in appropriate proportion to the threat. If you think there’s a 50-50 chance that a plane will crash if it takes off, you ground it. You can then convene some engineers to run some tests to figure out if the problem is real, but first you ground it.
Yudkowsky also thinks that the centrists are wrong about the unpopularity of his views. The American people, he points out, hate AI.7 The problem the centrists are having, in Yudkowsky’s view, is that they say something insane like “AI might well end life on Earth” and then say “so we would like these tech companies to voluntarily agree to third-party audits so we can better understand how this risk is developing.” Yudkowsky says that in fact you can be more popular here by going more extreme: “AI seems set to end life on Earth. Therefore, ban building it.”
I have no idea who is right in the wars over public opinion. (Though The Argument will be releasing polling on several AI questions soon; subscribe to get it in your inbox!)
But again, I keep circling back to my original question: Who is right on the merits? There are already cases of AIs attempting complicated schemes to deceive us, and studying those cases has given us significant understanding of these entities and their behavior. We understand a lot more about how AIs might be dangerous than we did five years ago. The authors don’t think we can afford to wait for further signs but, again, don’t really prove their case here.
I’ll grant that we haven’t solved the problem of AI alignment, and I’m certainly not sure whether we’ll have advance warning that we’re about to unleash MechaHitler on the world. Given that, rushing ahead is, in fact, insane. It’s rushing a train with no brakes down a track that we are still laying — and that some of our engineers warn is not going to hold up at all — with all of humanity on board. Most people aren’t alarmed because they’re not paying attention, or because everything I’m saying sounds to them like the ravings of a science fiction author. If this book convinces them to go check what these CEOs are actually claiming they intend to do in the next few years, that will be an important improvement.
And yet! And yet! It is exasperating to read a book on the most important question — the survival of humanity — and have to reconstruct half its argument. It is even more exasperating to go check the chapter-by-chapter online notes to see whether an important missing step of the argument is present there.8
If this were not a very important book, I would content myself with saying that it’s not a very good one, but since I think it is a very important book, it is the most frustrating one I’ve read in several years.
No one should be building superintelligence
“We do not know how far beyond human-level intelligence we can go, but we are about to find out,” Sam Altman wrote in June 2025. He claims to believe this will go well. The first step of it going well is to “Solve the alignment problem, meaning that we can robustly guarantee that we get AI systems to learn and act towards what we collectively really want over the long-term.” Nowhere in the essay is there any indication that OpenAI is going to do this part before the superintelligence part.
It’s not just him.
“Developing superintelligence is now in sight,” Mark Zuckerberg wrote in July 2025.
Companies that are intentionally trying to create superintelligence are gambling with the fate of humanity. I don’t see why a democratic society should let them do it, and I think the only reason a democratic society is letting them do it is obliviousness.
So it seems important to say: These companies are not bluffing. It is not just marketing copy to them. They drank deeply from the well of Yudkowsky and others’ early 2000s Singularitarianism. Some of them truly believe that replacing humans with “superior” AIs would be a good thing, that we don’t even deserve to continue existing if we could be supplanted with something better (the book spends a while earnestly rebutting this). Others think superintelligence is inevitable, so if someone’s going to do it, it had better be them. Others still, like Altman, seem to believe whatever is most compelling to whomever they are speaking to in any given moment. And of course, some are in it for the chance that, right before they’re turned into a paperclip, they’ll sit at the top of Forbes’ list of billionaires.
I hope all of this eventually looks silly and that no one ever invents a superintelligence and that all this money and effort and time does not make AIs all that much smarter. I hope these companies will fail and their AIs will continue to be economically useful while falling wildly short of “goal-oriented and surpassing humans in every respect.”
But to the skeptics who believe this doomsday scenario is impossible: I do not think the evidence is on your side. AI models keep rapidly improving in every way that we know how to measure.
A lot of people fall into the trap of believing every bad thing they read about AI and, as a result, end up underestimating it — believing mistakenly that it has no useful commercial applications (it absolutely does) or that it can’t make money (it is making tons of money). Yes, there might be a “dot-com bubble”-style collapse, but the dot-com bubble didn’t happen because the internet was not commercially important. AI can do, and is doing, original research in limited contexts.
Maybe we won’t build superintelligence. But I’m not sleeping easy and neither should you.
What can we do?
Yudkowsky and Soares want a worldwide ban, enforced like nuclear nonproliferation treaties. If every data center is accounted for, no one can build a superintelligence.
Not to be one of the unbearable centrists they’re so sick of,9 but I am not optimistic that the world will turn away from all the hope, promise, and wealth that AI researchers promise — at least not without proof that building superintelligence is possible, on the horizon, and impossible to control. If a lab might stumble into superintelligence overnight, then we can’t afford to wait for that proof — but that a lab might stumble into superintelligence overnight was the weakest part of Yudkowsky and Soares’ argument.
Still, I think that you don’t have to buy their whole argument to regard it as outrageous that anyone is even trying to build superintelligence. If Microsoft had a nuclear weapons program, I’m not sure we’d spend a lot of time going “well, they’re probably overselling their nuclear weapons program.”
If I were a candidate for president in 2028, here’s how I’d frame it: No company has the right to hand over humanity’s future to AIs.
CEOs are, out of one side of their mouths, saying they will build superintelligence to justify a new massive investment round. Out of the other side of their mouths, they are assuring Congress that everything is fine and regulation would be premature and bad for innovation. Maybe they’re lying to investors; maybe they’re lying to Congress; maybe they have the genuine gift of believing whichever words are in that moment most convenient.
That game stops. No, you may not build superintelligence. No, you may not strategically equivocate about whether you’re building superintelligence. And yes, we should partner with the rest of the world to ensure it doesn’t either.
In a bunch of ways, the most boring and straightforward progressive analysis is correct here. A bunch of people are getting absurdly rich while risking your lives. These companies started out openly admitting it, but are now silenced by the gobs of cash stuffed in their mouths. These are not their risks to take. We should stop them.
Footnotes
1. No one did the other half of the work. It’s a problem. I think the first few chapters are unusually poorly written and the back half is stronger, which is unfortunate, because in 2025, who actually finishes a book?
2. Every chapter opens with a parable that is usually an exasperating Socratic dialogue between a character who agrees with the authors and one who does not, and ends with a QR code to a website with tons more arguments. The New York Times wrote that it “reads like a Scientology manual.”
3. Kevin Roose’s excellent New York Times piece is a sympathetic treatment: the eccentric whose impact on our world is extraordinarily large but not at all what he hoped for, the person who would be easy to dismiss but for how many of his odd plans have worked far better than they had any right to. He really did build an entire field of AI safety research by writing a Harry Potter fanfiction. Many of the people now doing the work he warns will kill us credit him for getting them started on it. I know Eliezer Yudkowsky; I like him. But this is not a profile of Eliezer Yudkowsky, because I really don’t think this book stands or falls as an opportunity to reflect on its author. It’s right about its core claims or it’s wrong about them.
4. They can do impressive, well-specified intellectual tasks, but they are incredibly far from being superintelligences and are worse than most humans at many specific tasks.
5. Many computer games are designed to be easy for people with a bit of programming skill — which I do not have — to build on or adjust. Even if, like me, you can’t program at all, you can now ask AIs to do this for you.
6. I think “the gap in planning ability between a parent and a 4-year-old is sufficient for the 4-year-old to be unable to thwart, deceive, or imprison the parent” is a stronger claim than the corresponding claim about the gap in planning ability between humans and animals. And the introduction of “humanity” rather than “a human” as a factor invites a complicated conversation about whether there will be one superintelligence or many, and why we should expect them to cooperate with each other.
7. Pew found that Americans are more concerned than excited about AI by a huge margin (50% more concerned than excited, 10% more excited than concerned, 38% both concerned and excited), and the share concerned has been increasing over time — the more we see, the less we like it.
8. To their credit, it usually is! But the set of people who are going to check all the online supplements is even smaller than the set who will read the whole book.
9. I am absolutely one of the unbearable centrists they’re so sick of.



