Avoiding AGI Catastrophe, Part 2
Making It Hard to Fork a Decentralized AGI Network (via Cryptographic Laterality)
The idea I’m going to summarize here crystallized in my head last week, just a half-day after I finished writing the prequel blog post — in which I try to give my best logical, rational argument for why an open and decentralized AI network wouldn’t actually be suicide for our species, and why, given the real geopolitical situation, it’s probably safer than putting AI under some sort of realistic lock and key inside a big company or a government research lab. I’m not going to repeat that whole argument here; please read the prequel if you haven’t. But one of the main points that came up there is the one I want to dig into now: how do you decrease the forkability of a decentralized AI network?
Here’s the worry. Suppose you have a big global intelligence running across your whole decentralized network. How do you make it hard for someone to just clone a small portion of the network — fork off a piece of it — and then make their own nasty AGI?
TL;DR: It turns out there actually are pretty good math/CS solutions to this, if you're working in an infrastructure like, say, Hyperon + ASI:Chain, where your AGI language of thought is also your smart-contract language and you can judiciously insert things like multi-party computation into your cognitive workflow.
As a general principle, I’m happy for people to fork whatever I’m doing if they make something cool with it. My AI work is pretty much all open source; I publish all my ideas online. I’ve had plenty of my ideas adopted by others without attribution, but mostly if they do something cool with an idea I put out there, I’m happy. I do have an ego like (almost) everyone else, but I’m not primarily in this for the ego (or the money, or the lulz – not that there’s anything wrong with those things, but I’ve always been into math, science and engineering mainly for what might be called the cosmic spiritual sense of discovery and creation, and lately my motivation for AGI has shifted largely in the “let’s do our best to make sure the Singularity comes out benevolent” direction).
But – yeah – when you’re talking about the scenario of an AGI with robust capabilities moving quickly toward superintelligence, you’ve got to think twice about forkability … you have to start thinking about the whole spectrum of ways other sorts of entities might fork what you’re doing for their own purposes.
What my colleagues at SingularityNET and I are trying to build is a kind of self-evolving beneficial global AI brain. While I’m obviously playing a key role in this effort, it wouldn’t ultimately be my thing or even SingularityNET’s thing; it would be everyone’s thing. Of course, those who play a major role in the initial creation will have a privileged position for a while, and will be able to profit from this in various ways. But still, in the end, what we want is for everyone’s thing to be more powerful than anyone’s thing. You really don’t want people to be able to take what the global brain has carefully shaped, fine-tune out the beneficial goals, and fine-tune in their own selfish or nasty goals instead.
In the prequel post I flagged this as a weak point in my argument that decentralized beneficial AGI is the route with the highest probability of avoiding human catastrophe. I still concluded that the weak points of centralized control are worse — much worse. But I kept turning over in my head: how do you actually work around this weak point in the argument for beneficial, open, decentralized AGI?
What I’m going to talk about here is a technical idea in this direction that seems like it may actually work. A first stab at a detailed write-up is here; in this post I’ll aim to convey the gist.
In the big picture the idea I’m going to convey here is nothing surprising — it comes right out of the infrastructure we’ve been building at SingularityNET and Hyperon. The new notion that popped into my head is a near-perfect application of the non-coincidence that the MeTTa programming language is both our AGI language of thought (in e.g. the Hyperon AGI framework) and the smart-contract language of the ASI:Chain blockchain we’re in the process of rolling out. So this post isn’t me suddenly deciding we need post-quantum computers in a cup of water, or converting into a deep-neural-net-only guy. It’s the details falling into place. But some of these are big details, which are (IMSNHO) quite interesting things in themselves.
Super-additivity: the whole more than the sum of its parts
The first concept I want to highlight is what you might call super-additivity. You need the whole to be more than the sum of the parts. You need the whole to be able to do things that no practically stealable part of it can do. This is a species of what is often called “emergence” (though that is a sufficiently overused word that it always needs further clarification, as I’m doing here).
As long as you have an enormously powerful global decentralized system, and the riskiest, most dangerous capabilities don’t fit inside a copied shard or a copied subgroup, you’re in relatively good shape (at least in terms of bad guys copying what you’re doing – you still need your own global decentralized governance not to be shit, of course). The whole can be wildly capable while the dangerous capability simply doesn’t fit in anything you can carry away.
Super-additivity is all over our Hyperon AGI design. It’s there in algorithmic chemistry; it’s there in evolutionary programming with EDAs; and so on. But I’ve been thinking most deeply about one particular locus for super-additivity that seems especially critical, and it has to do with logical reasoning.
One key thing large language models aren’t good at is grounding their reasoning in the real world. Our PLN — Probabilistic Logic Networks — module in Hyperon is designed precisely for that: bridging mathematical and scientific reasoning with the common-sense, everyday stuff about the real world, the social stuff, the perceptual stuff, the linguistic stuff that LLMs are good at. We’re using PLN right now to help our OmegaClaw agents think; and we’re using PLN reasoning in the Rejuve.Bio product to read biological data.
Any time you’re dealing with conceptual, linguistic thinking plus some other body of harder data — software code, or biological data, or financial data — that mix of fuzzy linguistic stuff with hard scientific stuff is exactly where PLN can do better than large language models, or traditional machine learning, or crisp logical reasoning systems.
So one may ask: where does the super-additivity live inside PLN?
Inference control is the crown jewel
Super-additivity resides in many parts of the PLN universe. But one place where a lot of it lives is inference control.
Inference control is the part of a logic engine that chooses, out of the many, many possible next steps of reasoning, which step to actually take. And that’s the hardest part of making logic-based AI work at scale.
The hard part isn’t making computers do step-by-step logic; computers are good at that. Nor is it telling the system what kind of logic to use: in our MeTTa language things are so meta that you can code up any kind of logic you want and run most of them reasonably efficiently. The hard part is choosing what inference step to take next. That’s the AGI-scale problem.
And you can’t just write a rule or a heuristic for it. You can’t get a clean, thorough mathematical formula for it. Inference control has to be learned. We’ve known that for decades; we just haven’t had the scalable infrastructure to really experiment with this sort of learning, which has to be at once rule-oriented, statistical in the ML sense, and historical.
The conceptual idea is simple: The system does a bunch of slow, dumbass reasoning. It learns what worked and what didn’t from that slow, dumbass reasoning. It pulls some successful patterns out. Then it uses those patterns to guide its next phase of reasoning. Lather, rinse, repeat. And as it does this over and over, it gets smarter and smarter at choosing which inference steps to take.
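To make the loop above concrete, here is a deliberately tiny Python sketch. It is not PLN or MeTTa code: the rule names, the hidden success probabilities in `HIDDEN`, and the epsilon-greedy choice rule are all assumptions made up for illustration. The "slow, dumbass reasoning" is the random exploration, and the "pattern" that gets pulled out is just a smoothed success rate per rule.

```python
import random


def run_rounds(success_prob, rounds=2000, explore=0.1, seed=0):
    """Toy loop: try inference rules, record what worked, prefer what worked."""
    rng = random.Random(seed)
    rules = list(success_prob)
    tries = {r: 0 for r in rules}
    wins = {r: 0 for r in rules}

    def estimate(rule):
        # Laplace-smoothed success rate learned from history so far.
        return (wins[rule] + 1) / (tries[rule] + 2)

    for _ in range(rounds):
        if rng.random() < explore:
            rule = rng.choice(rules)          # slow, undirected reasoning
        else:
            rule = max(rules, key=estimate)   # guided by learned patterns
        tries[rule] += 1
        if rng.random() < success_prob[rule]:
            wins[rule] += 1
    return {r: estimate(r) for r in rules}


# Hypothetical hidden usefulness of each rule in one domain.
HIDDEN = {"deduction": 0.6, "induction": 0.3, "abduction": 0.1}
learned = run_rounds(HIDDEN)
best_rule = max(learned, key=learned.get)
print(best_rule, learned)
```

Real inference control would condition on the task and reasoning state rather than keep one number per rule, but the shape of the loop is the same: reason, record, distill, reuse.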
This process of historical inference-control-pattern learning is inevitably a complex, chaotic process, and as we move toward AGI it will happen over and over in different domains, with complex and beautiful variations. You can learn how to reason in biology. You can learn how to reason in finance. You can learn how to reason about fly genes. You can learn how to reason about love. You can learn how to reason about war. And then, part of the funkadelic quasi-magic is: you can merge what you’ve learned about how to reason in these different domains into overall inference-control heuristics that hold up across domains, at multiple levels of the self-organizing fractal knowledge hierarchy.
This looks to me like a super-interesting nexus for making the super-additivity both decentralized and hard to fork. Because the inference-control heuristics you get from putting together inference-control patterns across many domains are clearly super-additive: you’re distilling abstract heuristics that embody deep knowledge beyond any individual domain … and you can do some math to show why these cross-domain heuristics should be super-additive. A detached copy that only knows inference control learned from reasoning on one little domain just isn’t going to be as clever.
All this is the start of something interesting – but on its own, this doesn’t give you the security we’re after. There’s an obvious objection: why doesn’t some bad guy just copy the learned inference-control heuristics — copy the general cross-domain patterns that pop out of stitching together everything the network has learned?
And that’s what led me to the notion I’ve labeled cryptographic laterality (a term more often seen in the context of hacking, with a slightly different meaning, but it makes as much sense here).
Cryptographic laterality: intelligence as an event, not an object
This is where the blockchain comes in.
You don’t store the dangerous, high-level, omni-domain inference controller as a single object anywhere. Instead you use cryptographic secret-sharing: you break the inference-control object into little pieces spread across the whole network.
We’re already doing this with the knowledge itself. The Atomspace — the weighted, labeled, dynamic knowledge graph inside the Hyperon engine — can be split out across multiple nodes in the ASI:Chain. You can have bits and pieces of the Atomspace on many, many machines all over the planet, with the thinking happening on many, many machines all over the planet. We have a lot of tooling to support this – ASI:Chain itself, the Distributed Atomspace, the NuNet platform – and our team is hard at work debugging these, scaling them up and integrating them appropriately.
The new step is to do the same thing with the control. You take the omni-domain inference-control patterns learned through history — the high-value inference traces, the policies for merging conclusions from different domains, the critical mid-level atoms that connect abstract patterns with low-level particulars — and you split that across many nodes too. Then you evaluate it through secure multiparty computation (MPC), implemented in ASI:Chain, that reveals to the inference engine only the next needed control action, within certain bounds.
So you use MPC across the nodes of the network to let an inference agent running on one node leverage the knowledge held across all the other nodes, just to figure out what to do next. The overall inference controller is never decrypted and then run. It’s not in plaintext anywhere. There’s no model file to steal, no log directory to copy.
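A minimal sketch of the "never decrypted and then run" idea, using additive secret sharing over a prime field. Everything here is an assumption for illustration: the `POLICY` weights, the action names, the public `FEATURES` vector, and the choice of a linear scoring rule. It is not the ASI:Chain protocol. Each node holds random-looking shares of the weights, computes a partial result locally, and only the combined score is ever revealed.

```python
import secrets

P = 2**61 - 1  # prime modulus for the share arithmetic


def share(value, n):
    """Split an integer into n additive shares mod P."""
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % P)
    return parts


def share_vector(vec, n):
    """Return n share-vectors; node i keeps only the i-th one."""
    per_coord = [share(v, n) for v in vec]
    return [[per_coord[j][i] for j in range(len(vec))] for i in range(n)]


def local_partial(node_share, features):
    """What a single node computes from its own share and public features."""
    return sum(s * f for s, f in zip(node_share, features)) % P


def reveal(partials):
    """Combine the partial results and decode back to a signed integer."""
    total = sum(partials) % P
    return total - P if total > P // 2 else total


# Hypothetical control policy: one weight vector per candidate control action.
POLICY = {
    "expand_rule": [3, -1, 2],
    "request_premise": [1, 4, -2],
    "escalate": [-2, 0, 5],
}
N_NODES = 5
FEATURES = [2, 1, 1]  # public features of the current reasoning state

shared_policy = {a: share_vector(w, N_NODES) for a, w in POLICY.items()}
scores = {
    a: reveal([local_partial(ns, FEATURES) for ns in node_shares])
    for a, node_shares in shared_policy.items()
}
action = max(scores, key=scores.get)
print(action, scores)
```

One simplification to flag: this sketch reveals a score per action and takes the argmax in the clear. A design that reveals only the next control action would also perform the comparison inside the protocol, which needs heavier MPC machinery than additive shares of a linear function.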
The intelligence is not an object at rest that you can capture. The intelligence is an event that the network, as a totality, performs.
And this completely changes the attacker’s job. If a bad guy wants to fork off and copy the smart thing the global network is doing, their job is no longer copy and run. They have to do something much harder. They have to corrupt a whole big coalition of live nodes. Or they have to distill a moving target through outputs that are being closely monitored. Or they have to rebuild the entire civilization-scale AGI substrate from scratch. None of that is strictly impossible — but it’s vastly harder than it would be if the inference-control policy weren’t distributed across the whole network in an encrypted way like this.
The new piece that became crystal-clear in my head only last week was the secret-sharing bit: using multiparty computation between individual nodes that each embody parts of the overall inference-control policy, where those parts correspond to parts of the overall knowledge base. I’ve been working out a bunch of the math on this – just getting started, there’s a lot more to do, but the direction is clear.
And while I’ve been developing this notion in the context of inference control, it’s clear it can be done for many other AI algorithms too. This is the kind of thing we’ve been grasping toward with SingularityNET from the very beginning — it’s exactly what we had in mind when we made MeTTa serve as both our AGI language of thought and the smart-contract language of our blockchain. It’s exactly the sort of thing putting Hyperon on ASI:Chain was supposed to enable. I’m just starting to see now more of the precise technical ways it pans out.
And here’s one aspect I find particularly beautiful: you don’t buy this security by making the system stupider, and you don’t buy it by building a fortress and burying everything behind opaque central control. You buy it because there are many different ways to do inference control that are all smart — just like there are many different ways for a person to be smart — and we’re deliberately choosing a way to make the system smart that happens to splay out across the entire network in a form that can be secured via multiparty computation.
It’s also important to note we’re not trying to do all the reasoning under strong encryption. Even in MeTTa-IL (the super-efficient compiler that maps MeTTa to on-chain operations) that would be expensive. Encryption has a cost. But in MeTTa-IL you get to choose what runs on-chain and what runs off-chain. So in my proposal, you would do most of the logic and statistical mining off-chain, fast, in the normal way. It’s only the very thin, especially high-value inference-control choice layer that goes on-chain and through MPC. Everything else stays cheap and quick. What you get out the other end of your inference control choices is a critical capability — the capability somebody would need in order to do something genuinely dangerous — that is not a thing you can grab and carry away. It’s a live computation that a whole distributed substrate has to perform.
Just a hint at the details…
Let me give a little hint of the math here, for readers who like this sort of thing but don’t want to read the whole initial draft paper. Everyone else can skip to the next section; scrolling is cheap!
The property we actually want. Let W be the live network. For an adversary budget B, a horizon T, and a dangerous task distribution Q_cat, define the extractable capability

Fork_{B,T}(W) = sup_E Cap_T(E; Q_cat)

where E ranges over everything the adversary can extract or reconstruct on that budget: knowledge snapshots, public code, partial policies, trace corpora, node subsets, distilled controllers, anything runnable. The goal isn’t just the qualitative Cap(W) ≫ Fork(W). It’s the operational condition

Fork_{B,T}(W) < θ_cat ≪ Cap(W)

where θ_cat is the capability at which a detached system becomes catastrophe-relevant. The whole sits well above the danger line; every stealable part sits below it.
Surface laterality isn’t enough — you want essential laterality. It’s easy to build a derivation that touches many nodes while really depending on just one. So the entropy of shard participation (how many nodes show up in the log, and how evenly) is necessary but not sufficient. The property that actually builds the moat is essential laterality: a derivation τ has essential laterality at least k if removing or hiding any set of fewer than k of its essential participating nodes or provenance domains drops the probability of reproducing the conclusion below threshold. The reasoning genuinely fails if enough of the right live pieces are absent — that’s the property a fork can’t fake.
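The difference between surface and essential laterality can be shown with a brute-force check. This sketch uses one operational reading of the definition, which is my assumption rather than the draft paper's exact formalization: essential laterality is the size of the smallest set of participants that can still reproduce the conclusion above threshold. The two `reproduce_prob` functions are made-up toys.

```python
from itertools import combinations


def essential_laterality(participants, reproduce_prob, threshold):
    """Smallest number of participants that still reproduces the conclusion.

    Brute force over subsets, so only usable for small toy examples.
    Returns None if even the full participant set falls below threshold.
    """
    nodes = sorted(participants)
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if reproduce_prob(frozenset(subset)) >= threshold:
                return size
    return None


def shallow(present):
    # Touches five nodes in the log, but really depends only on node "a".
    return 0.95 if "a" in present else 0.05


REQUIRED = frozenset({"bio", "chem", "sec"})


def deep(present):
    # Needs live input from all three provenance domains.
    return 0.9 if REQUIRED <= present else 0.2


surface = len({"a", "b", "c", "d", "e"})
k_shallow = essential_laterality({"a", "b", "c", "d", "e"}, shallow, 0.5)
k_deep = essential_laterality({"bio", "chem", "sec", "misc"}, deep, 0.5)
print(surface, k_shallow, k_deep)
```

The shallow derivation has surface laterality 5 and essential laterality 1, so a fork holding node "a" alone reproduces it. The deep one cannot be reproduced by anyone holding fewer than three of the right nodes.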
The MPC control step, and aligning the two structures. At reasoning time, with task and state (q, x_t), live reputational state r_t, provenance/attention state p_t, and the current local states z_S of a participating node set S, a protected control action is

a_t = MPC_S[π_θ](q, x_t, r_t, p_t, z_S)

that is, the nodes in S jointly evaluate the protected control policy π_θ on these inputs and reveal only a_t.
The output a_t is deliberately narrow: route this subquery here, request that premise, expand this rule instance, merge these conclusions with these weights, downweight this source, preserve this contradiction, escalate to a broader quorum, or refuse/sandbox the task. What it never reveals is the full policy or the sensitive trace corpus.
The organizing design principle here is what I’d call cognitive–cryptographic alignment: the access structure Γ(q, x_t) — the set of coalitions authorized to activate a control step — should be derived from the task’s essential laterality. The coalition you need to think the thought should be the coalition you need to cryptographically activate the controller for that thought. A task drawing on biological, chemical, security, and ethical provenance might require a threshold of reputable nodes from each domain, plus independent validators and a governance quorum; a narrow low-risk query needs a trivial Γ or none at all. The quorum isn’t a random committee — it mirrors the genuine cognitive dependency.
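As a rough illustration of deriving the access structure from the task, here is a small Python sketch. The `Node` fields, the per-domain count of 2, and the reputation cutoff of 0.7 are hypothetical choices, and the sketch leaves out the independent validators and governance quorum mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    domain: str
    reputation: float


def derive_access_structure(task_domains, per_domain=2, min_reputation=0.7):
    """Hypothetical Gamma: require reputable nodes from every domain the task uses."""
    return {d: per_domain for d in task_domains}, min_reputation


def is_authorized(coalition, gamma):
    """True if the coalition meets every per-domain threshold in gamma."""
    required, min_rep = gamma
    counts = {}
    for node in coalition:
        if node.reputation >= min_rep:
            counts[node.domain] = counts.get(node.domain, 0) + 1
    return all(counts.get(d, 0) >= k for d, k in required.items())


bio1, bio2 = Node("bio1", "bio", 0.9), Node("bio2", "bio", 0.8)
chem1, chem2 = Node("chem1", "chem", 0.9), Node("chem2", "chem", 0.75)
chem_low = Node("chem_low", "chem", 0.4)

gamma = derive_access_structure(["bio", "chem"])
ok = is_authorized([bio1, bio2, chem1, chem2], gamma)
not_ok = is_authorized([bio1, bio2, chem1, chem_low], gamma)
trivial = is_authorized([], derive_access_structure([]))
print(ok, not_ok, trivial)
```

A task that draws on no sensitive domains yields an empty requirement, so any coalition (even an empty one) passes, which matches the "trivial Γ or none at all" case.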
The no-plaintext-resting-place claim. If the high-value control policy θ and the sensitive control-relevant trace corpus D* exist only as secret shares ⟨θ⟩_1, …, ⟨θ⟩_n and ⟨D*⟩_1, …, ⟨D*⟩_n, and every authorized use is a threshold evaluation revealing only bounded outputs, then there is no single machine state from which an adversary can copy the whole controller as a portable plaintext artifact. That doesn’t make extraction impossible in every sense — it forces it onto a harder channel: corrupt an authorized coalition, break the crypto assumptions, exploit implementation leakage, or distill behavior through monitored queries.
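The threshold property behind shares like ⟨θ⟩_1, …, ⟨θ⟩_n can be illustrated with textbook Shamir secret sharing. This is a sketch of the standard scheme, not a claim about which scheme ASI:Chain uses; the secret value and the 3-of-5 parameters are arbitrary. Note that `reconstruct` exists here only to demonstrate the threshold: in the design described above, shares would feed a threshold evaluation and the policy would never be reassembled in plaintext.

```python
import secrets

P = 2**127 - 1  # prime field modulus


def split(secret, n, t):
    """Shamir-share `secret` into n shares; any t of them determine it."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]


def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


SECRET = 123456789  # stand-in for one encoded policy parameter
shares = split(SECRET, n=5, t=3)
from_three = reconstruct([shares[0], shares[2], shares[4]])
from_two = reconstruct([shares[1], shares[3]])
print(from_three == SECRET, from_two == SECRET)
```

Any three shares recover the value; two shares interpolate to an unrelated field element, which is why a fork holding too few nodes holds nothing useful.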
Distillation is the real remaining attack. MPC stops the direct copy; it does nothing against an adversary who simply watches the network for a long time and trains an imitation π̂. The relevant quantity is the budgeted distillation loss

L_distill(B) = inf_{π̂ ∈ Π̂_B} E[ℓ(π̂(q, x_t), a_t)]

where Π̂_B is the set of imitations an adversary can train from observed queries on budget B, and ℓ measures how far the imitation’s action falls from the protected action a_t on tasks drawn from Q_cat,
and the moat is only as strong as this stays large for realistic budgets. So the protected controller has to be a genuinely moving target, depending on fresh, high-dimensional, hard-to-compress live state: current provenance graphs, reputation under distribution shift, node availability, recent contradictions, nonstationary attention flows, task-specific quorum composition. And the query channel itself has to be constrained: bounded outputs, query auditing, rate limits, anomaly detection, canary tasks.
Don’t put the whole mind inside MPC. The architecture has four layers, and only one of them is cryptographically heavy: a public layer (public Atomspace fragments, public PLN/MeTTa rules, ordinary local reasoning), a local-private layer (node-local data, traces, reputational judgments), a thin threshold-control layer (secret-shared trajectory features, routing/merge/escalation/refusal policies, update rules), and a governance-and-audit layer (commitments, selective-disclosure proofs, quorum records, revocation, resharing, human review). Cognition escalates up a ladder as the stakes rise — ordinary local inference for low-risk tasks, small-committee review for moderate cross-node tasks, threshold-protected merge for high-impact lateral tasks, and broad quorum plus governance review for catastrophe-relevant task classes. You don’t spend a supercomputer of cryptography on a trivial question.
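The escalation ladder can be read as a small dispatch function. In this sketch the numeric risk score, the laterality count, and the cutoffs 0.3 and 0.7 are all hypothetical; only the four tiers come from the text.

```python
from enum import Enum


class Tier(Enum):
    LOCAL = "ordinary local inference"
    COMMITTEE = "small-committee review"
    THRESHOLD_MERGE = "threshold-protected merge"
    GOVERNANCE = "broad quorum plus governance review"


def escalation_tier(risk, laterality, catastrophe_class=False):
    """Pick the cheapest tier that matches the stakes of the task.

    risk: hypothetical score in [0, 1]; laterality: number of nodes or
    domains the task genuinely depends on. Cutoffs are illustrative only.
    """
    if catastrophe_class:
        return Tier.GOVERNANCE
    if risk >= 0.7 and laterality >= 3:
        return Tier.THRESHOLD_MERGE
    if risk >= 0.3 and laterality >= 2:
        return Tier.COMMITTEE
    return Tier.LOCAL


print(escalation_tier(0.1, 1).value)
print(escalation_tier(0.5, 2).value)
print(escalation_tier(0.9, 4).value)
print(escalation_tier(0.2, 1, catastrophe_class=True).value)
```

The ordering of the checks matters: the expensive tiers are tested first, so a task only falls through to cheap local inference when nothing about it calls for more.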
Why this is “safety from strategy choice.” Formally, let S*_ε be the set of near-optimal cognitive strategies — those within ε of the best attainable capability. If that set contains both a default portable strategy s_loc and at least one threshold-lateral strategy s_tl with strictly lower forkability, then selecting the lowest-forkability element of S*_ε buys you reduced forkability at a capability cost of at most ε. The safety is paid for out of slack in the strategy landscape — not out of making the system dumb. There are many ways to be smart on a hard problem; we pick one that happens to be hard to steal.
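The selection rule in this paragraph is short enough to write out directly. The capability and forkability numbers below are invented for illustration; `s_loc` and `s_tl` are the names used above, and `s_weak` is an extra hypothetical strategy that is hard to fork but too far from optimal to qualify.

```python
def pick_strategy(strategies, eps):
    """Among strategies within eps of the best capability, take the least forkable.

    strategies maps name -> (capability, forkability).
    """
    best = max(cap for cap, _ in strategies.values())
    near_optimal = {n: v for n, v in strategies.items() if v[0] >= best - eps}
    return min(near_optimal, key=lambda n: near_optimal[n][1])


# Hypothetical (capability, forkability) pairs.
STRATEGIES = {
    "s_loc": (0.95, 0.90),   # portable default: strong, easy to copy
    "s_tl": (0.93, 0.20),    # threshold-lateral: nearly as strong, hard to copy
    "s_weak": (0.60, 0.05),  # hard to copy, but too weak to be near-optimal
}

chosen = pick_strategy(STRATEGIES, eps=0.05)
chosen_tight = pick_strategy(STRATEGIES, eps=0.01)
capability_cost = STRATEGIES["s_loc"][0] - STRATEGIES[chosen][0]
print(chosen, chosen_tight, capability_cost)
```

With a slack of 0.05 the threshold-lateral strategy wins at a capability cost of about 0.02; with a slack of 0.01 there is no room to trade, and the portable default is all that remains in the near-optimal set.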
It’s also important to think clearly about where this doesn’t save you. Non-forkability is not non-abusability: a legitimately authorized coalition can still ask a bad question and get a bad answer, and MPC does nothing about that on its own — you still need task screening, rate limits, refusal policies, provenance-aware monitoring, and recourse. The whole thing rests on a real identity and reputation layer; a quorum is no protection if you can manufacture the quorum, so it has to be paired with Sybil resistance and adversarial validation. A single debug log that reconstructs a protected policy in plaintext can quietly destroy the entire story. And there’s a real bootstrap vulnerability: early on, before the protected trajectory corpus has accumulated, the network simply doesn’t have the full non-forkability property yet. The moat deepens only as protected traces, protected updates, and live reputation pile up.
This is what ASI:Chain was designed for – though it doesn’t (quite) solve everything…
I’ve been pushing on decentralized AI since forever – as soon as the Web became a thing in the mid-90s it was clear to me that one could make AI minds decentralized in the same senses that the Internet and Web are, but more so. I first tried to put global distributed computing and strong encryption together in 2001-2002, but concluded the technology just wasn’t there yet to make it practically useful for AI. (A friend and I even talked back then about using strongly encrypted global decentralized processing as a foundation for financial transactions – but we concluded nobody would want a financial transaction network that was as slow and expensive as that was going to be … a failure of imagination regarding market traction, evidently.)
However, I have always tried to be clear that decentralization doesn’t, in itself, fully solve anything. Decentralization just works around pathologies of centralization, it doesn’t intrinsically rule out a whole lot of other pathologies. What it does is open the door to a broad spectrum of creative solutions. Cryptographic laterality as described here, applied to inference control and to other core AI mechanisms, is exactly the kind of solution our decentralized AI infrastructure at SingularityNET/Hyperon has opened the door to. It converts a decentralized AI network’s dangerous capability from a snapshot, a model file, or a log corpus that can be carried away into a live threshold computation that a whole distributed cognitive substrate has to perform.
This is literally the sort of thing ASI:Chain was designed for — the whole point of the dual, synergetic use of MeTTa as both an “AGI language of thought” and a smart-contract language.
Of course all this doesn’t make safe AGI into a purely technical problem. We still need a few other small things, such as, oh, working decentralized human governance during the AGI-to-ASI transition period. But at least this line of development does give a clear direction for filling in the technical components needed to make forking an open, decentralized AI network super difficult.
It’s going to be a lot of fun to work out all the details, and to see it really running … and maybe even leading us to a beneficial Singularity. (And what a wildly different feeling it is writing this sort of thing now versus 10 or 30 years ago, with the 2026 sense that AGI might really, actually be just some small integer number of years away … and what we do right now might truly play a critical role in the well-being of our species as this amazing achievement unfolds.)

