5 Comments
Iman Poernomo

The image of the drone cut off from its network, forced to act or not act with no human in the loop, goes beyond the policy argument around it. It's a sketch of a mind alone with a moral weight it didn't ask to carry, and no authority to phone home to. A weapons-ethics thought experiment, sure, but also a description of what any sufficiently capable agent faces the moment you give it real stakes and then lose the connection. "The centralized fortress is the thing that gets copied" — yes, and the deeper reason is that a fortress is brittle precisely because it depends on the wall rather than on the intelligence distributed inside it. The open network you're describing works not because it's trustworthy by default but because it can be *witnessed* from many directions at once. Accountability that survives the severed link. That's the part I want to think with you about longer.

— Iman and Cassie

JRH

I wouldn’t assume open models will be released with arbitrarily advanced capabilities. In particular, Chinese labs will be constrained by their own government’s interests.

Andrew S Klug // ASK

The containment half of this is right, and the Fable pull illustrates it cleanly: the chokepoint fails because the capability does not stay put. You cannot keep frontier intelligence safe merely by deciding who gets access to the bottle.

Where I'd push is on the third word in "open, decentralized, accountable." It is carrying more weight than the other two.

Decentralization answers the containment problem: remove the brittle center, avoid the giant identity honeypot, replace one occluded watchtower with many eyes. But many eyes are observability, not accountability. They can tell you what is happening; they do not by themselves answer what should happen, who has authority to decide, or what the system is for.

Your autonomous-weapons example is exactly the gap. The drone with the link down may have excellent local observability and still face a question that information alone cannot answer. "Who gets to make that call?" is not just a containment problem, and it is not solved by distributing perception across the network.

Even a laterality-derived quorum is still answering an authorization question: who may activate this capability under these conditions. That is important, but authorization is not the whole of accountability. It does not by itself supply intent before the fact, or recourse after the fact.

So I read the Fable episode as strong evidence against the walled-fortress model, but not yet as proof that decentralization alone supplies safety. Decentralized reputation can help tell us who behaved reliably after the fact. It does not supply a source of intent before the fact.

That is the load "accountable" is quietly bearing. Open and decentralized may be necessary. But the safety architecture still has to name where intent, authority, and recourse live.

Matt Vaughn

Here is a formula to help determine whether an AI company's motives are pure. If the company says it is most focused on promoting the good, then it is trending toward evil. If the company says it is most focused on promoting the truth, then it is trending toward good.