Discussion about this post

Alex Tolley

Asimov anticipated the problem in his robot novel, "The Naked Sun": how do you make a robot that cannot break the First Law of Robotics ("do not harm a human being") kill a person? It was done by giving robots separate, innocuous commands that, in aggregate, did kill a human.

We know from human security systems that even security in depth can only do so much. One might want to drive the probability of failure down to very low values, but there are limits. We have seen this with even simple things: over my lifetime, toy chemistry sets have been dumbed down to absurd levels to prevent any harm, pretty much destroying their educational value.

The problem will manifest itself in any number of domains. Take the production of weaponized biological agents: one might have to restrict access to DNA databases, PCR chemicals and equipment, and so on. It can be done, but it will wreck the ability of non-malicious actors to do any interesting research.

Criminals will always find a way to bypass controls. Authorities can only try to lock down access ever more tightly. But at some point, the effort to protect against the bad cripples the value of the thing being protected. We have examples of this from efforts to control nuclear weapons: fissile material has been diverted or stolen, and bombs can be designed with available information and knowledge.

By all means, find ways to try to defend against jailbroken AI, but I suspect not only that it cannot be done, but also that the systems designed to do so will prove more of a burden on the target.

Nicholas j Bogaert

Also, this is exactly the direction the field has to move.

LLMs are not the whole intelligence stack. They are the fluent generation layer. The real safety problem is what happens around them: memory, provenance, authorization, tool control, drift monitoring, audit trails, and a gate that fails closed when capability exceeds trust.
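As a rough illustration of what "a gate that fails closed" could mean in practice, here is a minimal Python sketch. It is not any particular system's implementation: the names ToolRequest, Gate, trust_levels, and required_trust are hypothetical, and trust is reduced to a single integer purely for the example. The point it shows is that an unknown caller, an error, or a capability above the granted trust level all lead to the same result, a denial that is still written to the audit trail.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ToolRequest:
    # All fields are hypothetical, chosen only for this sketch.
    caller: str          # identity making the request
    tool: str            # name of the tool the caller wants to run
    required_trust: int  # capability level the tool demands


@dataclass
class Gate:
    trust_levels: Dict[str, int]                       # caller -> granted trust
    audit_log: List[str] = field(default_factory=list)

    def authorize(self, req: ToolRequest) -> bool:
        # Fail closed: anything unexpected (unknown caller, bad data)
        # ends in a denial rather than an accidental allow.
        try:
            granted = self.trust_levels[req.caller]    # KeyError if unknown
            allowed = req.required_trust <= granted
        except Exception:
            allowed = False
        verdict = "ALLOW" if allowed else "DENY"
        self.audit_log.append(f"{req.caller} {req.tool} {verdict}")
        return allowed

    def execute(self, req: ToolRequest,
                action: Callable[[], str]) -> Optional[str]:
        # The action runs only after an explicit allow.
        if not self.authorize(req):
            return None
        return action()


gate = Gate(trust_levels={"robot_body": 1})
print(gate.execute(ToolRequest("robot_body", "read_sensor", 1), lambda: "ok"))
print(gate.execute(ToolRequest("robot_body", "open_door", 3), lambda: "opened"))
print(gate.execute(ToolRequest("stranger", "read_sensor", 0), lambda: "ok"))
print(gate.audit_log)
```

In this sketch the default outcome is a denial, and every decision, allowed or not, leaves a trace that can be reviewed later.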

That is why the “wall” approach was never going to be enough. You do not solve AGI safety by locking the model in a tower or pretending dangerous capability can be erased from the world. You solve it by building lawful boundaries around release, execution, identity, and memory.

AI.Web has been working from that premise for a while now: local compute, local memory, local identity, symbolic verification, trace-level monitoring, and what we call the Genesis Node.

My view is simple: the AGI-level mind should not live loose inside a moving robot body. The robot should be an interface. The deeper intelligence should stay in a stable sovereign node, owned by the user, auditable, gated, and grounded. The body requests. The node governs. Pandora’s box stays on the table.

For anyone trying to understand the larger frame without chasing ten scattered posts, I laid it out here:

https://nicholasjbogaert.substack.com/p/the-wall-was-never-the-plan?utm_source=share&utm_medium=android&r=5zf6op

No accusations. Ideas converge. But some of us have already been building the box the rest of the field is starting to describe.
