The Anthropic Fable Farce
The Very Predictable Problems with Crippling or Export-Controlling Frontier Intelligence
Anthropic shipped its most powerful model on a Wednesday and was ordered to pull it by Friday. That forty-eight-hour farce is probably about as close to a controlled demonstration as we’re going to get of why advanced AI can’t be effectively locked down in the actual modern world — and why open and decentralized is the only kind of safety that survives contact with the real world.
What happened this week with Anthropic, the US government, and the Fable model is about as close to a controlled experiment as real life ever hands you. For the last week I’ve been writing and talking about the relative safety of open versus closed, decentralized versus centralized AI — and the gist of my argument has been that even if you want to lock down a closed, centralized model, that is simply not how things work given the technology and the geopolitics we actually have. And then the world went and illustrated the point for me quite beautifully, over a period of roughly forty-eight hours.
On Wednesday, Anthropic released Fable 5 — the first widely available model in its Mythos family, its most powerful line. By Friday evening the US government had issued an export-control directive barring any foreign national from accessing it — not just foreigners abroad, but any foreign national anywhere, including Anthropic’s own non-citizen employees. Faced with that, Anthropic had no real choice but to pull the model offline for everyone.
As an aside, I have a small personal stake in this particular policy mayhem. I was born in Brazil, though I grew up almost entirely in the United States; I’m a dual citizen, which is to say I am both a foreign national and a US national. Am I allowed to use Fable? I have no idea, and it doesn’t matter, because nobody can. I log into Claude and it’s simply gone; I switch the little selector back to Opus and carry on. If this policy sticks, the logical endpoint is KYC for frontier AI: upload your passport to use the latest model, the way you do when you open an overseas bank account online.
A model too dangerous to release — until it wasn’t
I’ll try to summarize the backstory concisely, as most readers probably know it. Not long ago Anthropic put out Mythos, a model especially good at hacking — finding and chaining software vulnerabilities. This is part of an ongoing churn in the security world, where models keep getting better at breaking into systems and, in parallel, better at hardening them: a security arms race that is both inevitable and genuinely worrying, and one I suspect gets tamed in the medium term mainly by correct-by-construction software rather than by any one model staying ahead. Mythos is very good at hacking. So are other models. The next generation will be even better.
From the start I suspected the breathless framing (this model is too dangerous to release) was a blend of sincere concern and shrewd positioning. There’s an obvious marketing logic to telling people a thing is too dangerous for them to have, and then, a couple of months later, handing it over: now they feel they’re holding the most dangerous object in the world! Exciting!
Go back to the prehistory of the LLM space and OpenAI did exactly this with GPT-2 in 2019, agonizing in public over whether the model was too dangerous to release and then releasing it to a chorus of “this must be incredibly powerful.” That model was in fact a real step forward. So is Mythos. Neither is the runaway supermind that turns into Skynet. What exact mix of sincerity and theater is in play in these companies I can’t say — I know people at the labs, but not the ones setting this policy — but the pattern is familiar.
Fable is essentially Mythos with part of its range disabled: ask it something sensitive — cyber, bio, and the like — and it quietly falls back to a weaker model. That’s a perfectly reasonable thing to do. And we never really know what blend of models sits behind any of these branded services anyway; anyone who codes with them has watched a model get suddenly smarter, then dumber, then smarter again, month to month. Playing games on the back end is not new, and it fits the marketing narrative neatly: this is the most dangerous thing in the world — do try it.
The backstory: Anthropic versus the Pentagon
Another, more interesting thread has been running between Anthropic and the US government. A few months ago the two were locked in a standoff: the Pentagon wanted to use Claude “for all lawful purposes”, with no restrictions, and Anthropic held two red lines: that its models not be used for mass surveillance of Americans, and not be wired into fully autonomous weapons that select and engage targets without human oversight. Anthropic refused to remove them. The administration’s response was unusually heavy: an order for federal agencies to stop using Claude, a “supply chain risk” designation of the sort normally reserved for companies tied to foreign adversaries, and a six-month phase-out. Where Anthropic drew hard ethical lines and was punished for it, its competitors were notably more accommodating about military use.
A genuinely hard question about autonomous weapons
I’m going to take a moment to be pedantically careful on this “Anthropic versus Pentagon” matter — because while this bit about military uses is not the central proximal point of the Fable farce, it’s a key thread in the fabric of the story … and it’s totally not the simplistic morality play it’s sometimes made out to be.
I’m absolutely a pacifist at heart — but a practical one. I’m also (secular) Jewish, and I have a lot of sympathy for Einstein, who was a pacifist right up until he watched what Hitler was doing and concluded that there are moments when refusing to shoot back is not the compassionate choice — that clinging to a principle for its own sake can be a kind of egocentric or ideological pathology rather than a kindness to your tribe or your species. “AI should never kill without human supervision” sounds unimpeachable until you try to build the edge cases.
Picture a drone, cut off from the network, watching a man who is seconds from pressing a button that will launch a weapon that kills a million people. The drone’s hard rule is that it may not use force without a human in the loop — and there is no human in the loop, because the link is down. So the rule, faithfully followed, lets the million die. In that situation the drone should act. Such cases are rare, but they aren’t imaginary, and once you admit even one of them the absolute version of the rule is gone, and you’re left with the genuinely hard question: who gets to make that call?
That’s where I have real sympathy with both sides in this sort of argument. On some particular ethical judgments I might trust Anthropic’s engineers over the Pentagon. But the US military are not naïfs about this; they are actually among the few who have wrestled with the ethics of autonomous force for decades. I spent nine years in DC, most of it consulting for Army intelligence (among other government agencies), and I saw how much serious, funded, careful work has gone into exactly these military AI ethics questions, with vastly more nuance, frankly, than the labs are showing right now. (Which makes total sense: military ethics is a core part of the US military establishment’s business, even though they have made their share of missteps, whereas the frontier LLM labs are very busy with other things and military ethics is a side topic for them.)
So I can feel where Dario Amodei is coming from (“I built this software to help people, not kill people!” is my gloss of his line of thinking, not a quote). I can equally feel the frustration of DC people who’ve studied autonomous-weapons ethics for thirty years and are now told their policy will be set by relative newcomers to the military-ethics field, people who happen to have trained the big AI models of the day and think this qualifies them to tell the army, the cybersecurity world, the job market and everyone else what to do. These are very tricky matters, and there are decent points on both sides.
Why the ban won’t work, even if it’s well meant
But still: banning foreign nationals from using an AI model just won’t achieve anything valuable, in anything vaguely resembling the current situation.
Just to make it crystal clear: as I said above, I can see why some folks in the US government might WANT to ban bad guys from using our best models, and might think banning foreign nationals is an acceptable cost to achieve this. If the single most cyber-capable model on the planet sits on an open tap for anyone on Earth, then your adversaries are among “anyone on Earth.” Picture your own military hamstrung by a legal fight with its own AI vendors while China quietly reconstructs your best model through some mix of reverse engineering, espionage, and hacking, and then uses it freely. That is a real national-security worry, not a cartoon. From my Beltway years I came away with a deep understanding that there is more genuinely nasty activity being defended against than most people imagine. So yeah, I do take the US government’s motives seriously.
The trouble is that the measure that’s been taken doesn’t effectively address the thing it’s worried about. Taken literally, “no foreign nationals” means KYC for AI — and anyone who’s spent time in the crypto world knows how leaky a barrier KYC is. Long before modern AI, criminals routed around it by simply buying other people’s verified accounts; people open accounts and sell them, and it’s been a staple of money laundering for decades. Now add AI to the attacker’s side. The “liveness check” — hold up your passport, turn your head so we can see you’re real — was never strong, and today a deepfake can paste someone else’s face over a live participant convincingly enough to satisfy the bank on the far end. That technology already exists.
So KYC won’t stop a terrorist network, and it won’t stop the Chinese or Russian states, who are well documented as able to defeat identity controls at scale. It will stop a random user in England or Portugal. And it makes things worse in another quite specific way: to run KYC, the labs must collect and warehouse everyone’s passports and biometrics. That is one enormous pile of the most sensitive identity data imaginable, exactly the kind of honeypot that gets breached (by software like, oh, say, open Chinese Mythos clones) and then feeds the very identity-forgery economy that defeats KYC in the first place.
And the capability leaks anyway
Suppose, though, that KYC worked perfectly (it doesn’t and won’t, but suppose). It still wouldn’t keep the capability away from China, because the capability doesn’t stay put. Whatever the largest, most expensive model can do, a smaller and cheaper one tends to manage a few months later; we’ve just watched open models with a few billion parameters match the top near-trillion-parameter systems on important classes of tasks. If someone wants a small, cheap model that hacks at the Mythos level, I have little doubt one will exist within months (some people claim to have them already), and the best open models in the world are now coming out of China regardless.
The likely result of this ban, even if it holds, is not that adversaries are denied anything; it’s that, three or six months from now, China releases a Mythos++ and hands it to everyone, including those random users in England and Portugal, and world market share drifts from the US toward China, all in the name of blocking terrorists who were never going to be blocked this way.
As a footnote, for the technical tasks I do, my few days with the Fable model before it disappeared convinced me it was significantly smarter than the previous Claude models, but not up to the level of GPT-5.5-Pro (which is very slow and expensive, but for really deep tasks that’s OK). I didn’t try to jailbreak it to unlock its hidden Mythos-level hacking skills, though.
The own-goal, and the deeper mistake
How is Anthropic faring in all this? On one reading they shot themselves in the foot: they spent months telling the world how world-historically dangerous this technology is, and the government eventually answered, fine — we agree it’s too dangerous, so we’re export-controlling it — leaving Anthropic to argue, awkwardly, that its flagship isn’t really so dangerous after all. We’ve seen this movie, right?
On another reading this may fall into the “no such thing as bad publicity” category; conceivably all the noise, and the likely eventual rollback, only burnishes the company’s aura of power ahead of a rumored trillion-dollar IPO. Marketing is not my expertise, so I’ll leave this one open.
What I am confident about is the deeper mistake underneath the whole episode, which is arms-race thinking: we must keep the edge from our rivals. I understand the fear. There really are bad actors; US intelligence really does stop genuinely terrible things — I spent years building software meant to help with parts of that (mostly information management, and deliberately not the parts that hurt people directly). But given modern technological and geopolitical realities, you cannot, in fact, deny a technology like this to your geopolitical rivals, except narrowly and briefly.
You don’t win an arms race in leg irons
So one lesson of this Fable farce is: what arms-race thinking actually buys you is self-injury in the part of the race you can’t easily opt out of. To whatever extent an arms race is unavoidable given the geopolitics, the last thing you want is to run it crippled — and blocking your allies, alienating your own users, and pushing everyone toward a black market in stolen accounts is precisely running it crippled.
The arms race is a trap. You don’t win it by running in leg irons. The way out isn’t to play it better; it’s to change the game.
Open, decentralized, accountable
The only apparent and viable alternative, as I keep saying, is “open, decentralized, and accountable.” You can’t put the mind in the bottle — perhaps in some science-fictional world, but not in the one we actually inhabit, with these technologies and this geopolitics. What you can do is build a system whose security comes from how it’s constructed rather than from a wall and a central gate. Decentralized identity and decentralized reputation instead of Anthropic holding one giant database of everyone’s passports for a hacker to lift. Many distributed eyes on what bad actors might be doing instead of a single watchtower with a permanently occluded view. No single chokepoint to seize or breach.
Yes — if I build open, decentralized AI, the US military will be able to use it, and so will others, and someone, somewhere, will try to do something bad with it. But the risk is coming either way; the bad guys get the AI either way. The point is that an open, decentralized network can be engineered, using intelligent blockchain-based design — I wrote recently about “cryptographic laterality”, how you make such a network genuinely hard to fork or copy — to resist exactly the theft and replication that centralized functionality invites. And that replication is, by now, almost routine, enabled by the very hacking capabilities Mythos has been pushing forward. The centralized fortress is the thing that gets copied. The decentralized network — if built correctly — can be the thing that doesn’t.
A clarifying farce
A year from now I suspect the pulling of Fable will look like a minor, clarifying farce, the way we now look back on the GPT-2 “too dangerous to release” episode. This was the week the lock-it-all-down theory tripped over its own feet in public, inside of seven days.
My hope is that we come out of this piece of absurdity with a little more clarity that the safe future was never the walled fortress. The safer future is the open, decentralized network. That future is not risk-free (nothing is), but open, decentralized safety is the only kind that makes sense in the real world: the only kind that is both achievable and resilient.
The walled garden here didn’t fail because terrorists broke in and made off with the dangerous goods. It failed because it couldn’t keep itself running effectively for forty-eight hours.
This story isn’t over, and I’ll probably have something more considered to say in a week or two. But that’s my seat-of-the-pants first reaction.

