Tensor Logic for Bridging Neural and Symbolic AI
How we’re gearing up to use tensor logic to help symbolic-logical-inference and neural-nets-on-GPUs synergize super-tightly together within the Hyperon AGI project
The Integration Problem
While my optimistic views on AGI are certainly less eccentric than they were a decade ago (let alone in the 70s and 80s when I started thinking about and working on this stuff) — i.e. the world is finally catching up to me, at least a bit and in some ways! — my perspective is still out of the mainstream in multiple respects. For one thing,
I do suspect we may be just a few years from human level AGI
However, I don’t think LLMs are going to get us there, nor is any architecture with LLMs at the center
I do think LLMs can play a major role as components of AGI systems, but if they’re going to, I suspect they will need to be tightly coupled with other sorts of AI components. There may be many ways to do this — including purely neural systems with a variety of neural sub-architectures, or hybrid neural-symbolic-evolutionary systems like Hyperon (my main bet and approach to AGI right now).
Hybrid AGI architectures can be ad hoc hacks, but they totally don’t have to be: the Hyperon design is based on principled mathematical and cognitive-science analysis of the different sorts of memory and processing needed for human-level AGI, followed by specific choices regarding which algorithms and representations look most promising for implementing each of the aspects involved.
Anyway, for those of us interested in neural-symbolic and other hybrid AI architectures, there are various “integration” headaches that pop up again and again — all rooted in the fact that symbolic reasoning systems and neural networks speak fundamentally different languages at the hardware level.
Symbolic systems love discrete structures—graphs, trees, logical formulas, pattern matching. Neural networks love dense matrices and tensors that can be blasted through GPU cores in parallel. When you try to build systems that combine both, you typically end up with awkward translation layers, serialization bottlenecks, and the constant feeling that you’re leaving performance on the table.
We’ve been wrestling with this problem in the Hyperon project as we scale up our PRIMUS cognitive architecture prototypes to run on our new scalable Hyperon infrastructure.
Among the good news: we now have MORK (our high-performance metagraph database) running scalably, and two MeTTa interpreters—PeTTa and MM2—that can handle serious workloads.
Among the challenges: how do we get our logical reasoning, pattern mining, and probabilistic inference to efficiently interoperate with deep neural networks and other tensor-heavy algorithms, especially when we want to leverage GPU acceleration?
One potential answer we’re exploring is to adopt and extend Pedro Domingos’s tensor logic as an intermediate representation—a mathematical lingua franca that lets symbolic and tensorial computation meet on common ground.
In this post I’ll explain what we’ve been thinking about and working on — with links at the end to some early draft papers giving details. The caveat: this is stuff we’re working on right now, and it looks very promising but is not yet proven. But it feels so interesting and exciting that I’m motivated to share it anyway ;) … and I’ll keep you posted as the work progresses!
The Core Insight: Logic Is Linear Algebra
The fundamental observation behind tensor logic is surprisingly simple and elegant: logical databases are sparse tensors, and logical rules are tensor contractions.
This has been “known all along” as folklore, but Domingos’s Tensor Logic draws out the implications in a clear, beautiful and useful way.
Consider a simple logical rule: “If X is a parent of Y, and Y is a parent of Z, then X is a grandparent of Z.” In Datalog notation:
Grandparent(x, z) :- Parent(x, y), Parent(y, z).
Now think about representing the Parent relation as a matrix P where P[x,y] = 1 if x is a parent of y, and 0 otherwise. The grandparent relation becomes:
G[x, z] = H( sum over y of: P[x, y] * P[y, z] )
where H is a step function that outputs 1 if the sum is positive, 0 otherwise. This is just matrix multiplication followed by a threshold—exactly the kind of operation GPUs excel at.
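To make this concrete, here is a minimal NumPy sketch of the grandparent rule. The tiny example family is of course made up for illustration, and the threshold plays the role of H:

import numpy as np

# Toy universe of four entities: 0=Alice, 1=Bob, 2=Carol, 3=Dave (made-up example)
n = 4
P = np.zeros((n, n), dtype=np.int8)
P[0, 1] = 1  # Alice is a parent of Bob
P[1, 2] = 1  # Bob is a parent of Carol
P[2, 3] = 1  # Carol is a parent of Dave

# Grandparent(x, z) :- Parent(x, y), Parent(y, z)
# G[x, z] = H( sum over y of P[x, y] * P[y, z] ), i.e. an einsum plus a threshold
G = (np.einsum('xy,yz->xz', P, P) > 0).astype(np.int8)

print(np.argwhere(G))  # [[0 2] [1 3]]: Alice is a grandparent of Carol, Bob of Dave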
This isn’t merely a notational trick. It means we can:
Store logical facts as sparse Boolean tensors
Express inference rules as Einstein summations (einsum operations)
Execute logical reasoning using the same GPU kernels that power deep learning
Seamlessly mix neural computations with logical inference in the same computational graph
Pedro Domingos and collaborators have recently been developing this perspective, sometimes positioning tensor logic as a unified framework for all of AI. We’re more modest in our own intended uses for the methodology—we see it primarily as interfacing magic between symbolic and tensorial AI tools in hybrid architectures—but the mathematical foundation is genuinely powerful.
Beyond Boolean Logic: Handling Uncertainty
In thinking through these potential applications of tensor logic, my colleagues and I have also been looking at how to extend it to make it even more useful — by augmenting it with an explicit treatment of uncertainty and compute resources.
Pure Boolean tensor logic is clean and fast, but real-world reasoning involves uncertainty. When we say “Alice might know Charlie through their social connections,” we want to propagate probability estimates, confidence intervals, or other uncertainty measures through our inference chains.
Toward this end I’ve introduced a Resource-Aware Probabilistic Tensor Logic (RAPTL) framework, which extends the basic tensor logic concept a bit. The key innovation is augmenting tensor logic with a triple product quantale:
Q = Q-logic × Q-uncertainty × Q-resource
Every piece of information now carries three components:
What it represents (the logical/structural content)
How certain we are about it (probability, confidence interval, PLN truth value, etc.)
What resources it needs (GPU memory, compute, bandwidth)
These three aspects travel together through all operations. When we compose two facts, we combine their logical content (conjunction), their uncertainties (using appropriate probability rules), and their resource requirements (summing for sequential operations, taking maxima for parallel ones).
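Here is a toy Python sketch of how the three components can travel together. The field names and combination rules below are illustrative simplifications of mine (a point probability and two scalar resource numbers), not the actual RAPTL definitions:

from dataclasses import dataclass

@dataclass
class Annotated:
    content: str        # logical/structural content (here just a label)
    probability: float  # stand-in for the uncertainty component
    flops: float        # stand-in pieces of the resource profile
    memory_bytes: float

def compose_sequential(a: Annotated, b: Annotated) -> Annotated:
    # Conjunction of content, product of probabilities (assuming independence),
    # and resources add because the two steps run one after the other.
    return Annotated(f"({a.content} AND {b.content})",
                     a.probability * b.probability,
                     a.flops + b.flops,
                     a.memory_bytes + b.memory_bytes)

def compose_parallel(a: Annotated, b: Annotated) -> Annotated:
    # Same logical/uncertainty combination; total FLOPs still add,
    # but peak memory is the maximum over the two parallel branches.
    return Annotated(f"({a.content} AND {b.content})",
                     a.probability * b.probability,
                     a.flops + b.flops,
                     max(a.memory_bytes, b.memory_bytes))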
Pluggable Uncertainty Models
Different applications need different uncertainty representations. Medical diagnosis might use probability intervals (“60-80% chance of condition X”). A game AI might use simple point probabilities. Our PLN (Probabilistic Logic Networks) system uses strength-confidence pairs that track both how true something seems and how much evidence supports that estimate.
Rather than hardcoding one approach, RAPTL defines an abstract uncertainty interface:
trait UncertaintyValue {
  type T
  def combine_conjunctive(other: T): T  // Both things are true
  def combine_disjunctive(other: T): T  // At least one is true
  def negate(): T                       // Opposite is true
  def marginalize(dim: Index): T        // Aggregate over possibilities
}
Concrete implementations—point probabilities, intervals, PLN truth values—plug into this interface, and the rest of the machinery works uniformly.
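For instance, here is roughly what a couple of concrete plug-ins might look like, written as a Python analogue of the interface above (the classes and the independence assumptions are illustrative, not the actual RAPTL implementations):

class PointProbability:
    # Single-number probabilities, combined under an independence assumption.
    def __init__(self, p: float):
        self.p = p

    def combine_conjunctive(self, other):  # Both things are true
        return PointProbability(self.p * other.p)

    def combine_disjunctive(self, other):  # At least one is true
        return PointProbability(self.p + other.p - self.p * other.p)

    def negate(self):                      # Opposite is true
        return PointProbability(1.0 - self.p)

class ProbabilityInterval:
    # Lower/upper probability bounds, like the "60-80%" example above.
    def __init__(self, lo: float, hi: float):
        self.lo, self.hi = lo, hi

    def combine_conjunctive(self, other):
        return ProbabilityInterval(self.lo * other.lo, self.hi * other.hi)

    def negate(self):
        return ProbabilityInterval(1.0 - self.hi, 1.0 - self.lo)

A PLN truth-value class carrying strength-confidence pairs would slot into the same interface.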
Resource Awareness: Making It Run Fast
To make things even more interesting, it’s worth considering that GPUs have complex memory hierarchies: huge but slow HBM (high-bandwidth memory), smaller but faster L2 cache, tiny but blazing-fast shared memory. The same computation can run 100× faster if data fits in shared memory versus streaming from HBM.
RAPTL tracks resource profiles alongside logical content:
r = [HBM-bytes, L2-bytes, SMEM-bytes, registers,
FLOPs, bandwidth, NVLink, launch-overhead,
nnz, density, format, rank]
This isn’t just bookkeeping—it enables certified rewrite rules that transform computations while guaranteeing both semantic preservation and performance improvement. Each optimization follows a guard-transform pattern:
IF: Guard(X) is satisfied
AND: CostModel(X) shows improvement
AND: Accuracy loss ≤ threshold ε
THEN: Transform X → T(X)
In other words: “If conditions are met, and it’s profitable, and accuracy loss is acceptable, then apply the transformation.”
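In code, a certified rewrite rule looks roughly like the following Python sketch; the guard, cost-model, and accuracy-loss hooks are hypothetical stand-ins rather than our actual implementation:

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class RewriteRule:
    name: str
    guard: Callable[[Any], bool]                # structural precondition on the expression
    cost: Callable[[Any], float]                # cost-model estimate (lower is better)
    accuracy_loss: Callable[[Any, Any], float]  # estimated loss from X to T(X)
    transform: Callable[[Any], Any]             # the rewrite X -> T(X)

def try_apply(rule: RewriteRule, expr, epsilon: float):
    # Apply the rewrite only if the guard holds, the cost model predicts an
    # improvement, and the estimated accuracy loss stays within epsilon.
    if not rule.guard(expr):
        return expr
    new_expr = rule.transform(expr)
    if rule.cost(new_expr) >= rule.cost(expr):
        return expr
    if rule.accuracy_loss(expr, new_expr) > epsilon:
        return expr
    return new_expr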
Key Transformations
Among the technical transformations one needs behind the scenes to make this sort of thing work, we are looking at ideas like:
Sparse-to-Dense via Factorization: Many sparse matrices are approximately low-rank. A 1000×1000 sparse matrix with rank 10 can be stored as ~20,000 numbers instead of 1,000,000, and computed using dense operations that GPUs love (see the code sketch after this list of transformations).
Format Mutation: Different sparse formats excel at different access patterns. CSR for row-wise access, ELL when rows have similar densities, BCSR for block structure. The framework automatically selects formats based on data statistics.
Cache-Aware Tiling: Break computations into tiles that fit in shared memory. A 100× speedup is possible when you can avoid round-trips to HBM.
Multi-GPU Sharding: For data too large for one GPU, partition to minimize cross-device communication while keeping “halos” of boundary data for correctness.
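To illustrate the sparse-to-dense factorization point, here is a small NumPy sketch using a synthetic matrix of exact rank 10; the numbers match the example above, but the matrix itself is made up:

import numpy as np

rng = np.random.default_rng(0)
n, r = 1000, 10

# Synthetic rank-10 matrix, standing in for an approximately low-rank sparse matrix.
A = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))

# Truncated SVD keeps only the top-r singular triples.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
L = U[:, :r] * s[:r]   # n x r factor
R = Vt[:r, :]          # r x n factor

print(n * n, L.size + R.size)   # 1000000 vs 20000 stored numbers

# A matrix-vector product becomes two skinny, GPU-friendly dense multiplies.
x = rng.normal(size=n)
y = L @ (R @ x)
print(np.allclose(y, A @ x))    # True (up to numerical error)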
Linear Logic for Memory Safety
Digging deeper into resource-awareness, there’s a subtle but critical issue with GPU programming: if two kernels write to the same memory simultaneously, you get data corruption. We need compile-time guarantees about memory ownership.
With this in mind, RAPTL incorporates linear logic modalities to regulate tensor copying and caching:
Linear (A ⊸ B): Use exactly once then delete—like a concert ticket
Affine (A → B): Use at most once—like a coupon
Bang (!A): Read-only, share freely—like a library book
With (A & B): Choose one path—like a fork in the road
These modalities compile to practical mechanisms: move semantics, reference counting, immutable shared buffers, and GPU event synchronization. The type system catches memory bugs at compile time rather than producing silent corruption at runtime.
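As a toy illustration of the linear (“use exactly once”) discipline, here is a Python sketch of the runtime analogue; in RAPTL the checking is meant to happen at compile time in the type system, and the class below is purely hypothetical:

class LinearBuffer:
    # Use-exactly-once handle: ownership is moved out on first use,
    # and any second use is an error rather than silent aliasing.
    def __init__(self, data):
        self._data = data
        self._consumed = False

    def consume(self):
        if self._consumed:
            raise RuntimeError("linear value used more than once")
        self._consumed = True
        data, self._data = self._data, None   # move semantics: give up ownership
        return data

buf = LinearBuffer([1.0, 2.0, 3.0])
kernel_input = buf.consume()   # fine: the one permitted use
# buf.consume()                # would raise: the 'concert ticket' is already spent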
Connecting to MORK via ShardZipper
How does all this connect to the MORK metagraph database (the core in-RAM knowledge infrastructure we use for Hyperon)? MORK stores knowledge in a highly optimized prefix trie (PathMap) structure that’s great for symbolic operations but not directly GPU-friendly. GPUs prefer small, contiguous arrays, not pointer-heavy tree structures.
Our ShardZipper approach, currently in the early stages of implementation, bridges this gap:
Partition: Split the trie by hashed prefixes until each shard’s size is manageable
Capture: Detach a shard and record a “zipper” (continuation) that knows how to splice it back
Materialize: Convert the shard into contiguous arrays—structure-of-arrays for indices, value arrays, label masks
Compute: Run GPU kernels (joins, projections, scoring) and emit compact “patch” records
Reattach: Apply patches and use the zipper to reintegrate in O(1) time
This workflow lets us execute tensor logic operations on GPU while maintaining MORK’s advantages for symbolic operations. The same knowledge graph supports both pattern matching (MORK’s strength) and matrix multiplication (GPU’s strength).
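Here is a deliberately oversimplified Python sketch of the loop’s shape; MORK’s PathMap is far more sophisticated than a Python dict, and every name below is illustrative only:

import numpy as np

# Toy stand-in for a trie of Parent facts: key = (relation, head), value = set of tails.
trie = {("parent", "alice"): {"bob"}, ("parent", "bob"): {"carol"}}
index = {"alice": 0, "bob": 1, "carol": 2}

def partition(trie, num_shards=2):
    # 1. Partition: split by hashed prefix until each shard is a manageable size.
    shards = [dict() for _ in range(num_shards)]
    for key, tails in trie.items():
        shards[hash(key) % num_shards][key] = tails
    return shards

def capture(trie, shard):
    # 2. Capture: detach the shard; the returned "zipper" splices a patch back in.
    for key in shard:
        del trie[key]
    return lambda patch: trie.update(patch)   # 5. Reattach: cheap splice of the patch

def materialize(shard):
    # 3. Materialize: contiguous structure-of-arrays index data for GPU kernels.
    rows = [(index[head], index[tail])
            for (_, head), tails in shard.items() for tail in tails]
    return np.asarray(rows, dtype=np.int32)

for shard in [s for s in partition(trie) if s]:
    reattach = capture(trie, shard)
    edges = materialize(shard)   # 4. Compute: GPU joins/projections/scoring would run here
    reattach(shard)              # in this toy case the "patch" is just the unchanged shard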
Semiring Flexibility
Different reasoning tasks need different algebraic structures:
Boolean (OR, AND): Reachability—does any path exist?
Counting (+, ×): How many paths exist?
Viterbi (max, +): What’s the best (highest-score) path?
Probabilistic: What’s the expected value under uncertainty?
Tensor logic handles all of these uniformly by parameterizing over the underlying semiring. The same infrastructure supports deterministic logic, probabilistic inference, and optimization.
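For example, here is a small Python sketch of the same two-hop path computation parameterized by different semirings; the toy graph and scores are made up:

import numpy as np

def two_hop(A, add, mul):
    # Generic "rule" over a semiring: out[x, z] = add over y of mul(A[x, y], A[y, z]).
    n = A.shape[0]
    out = np.empty((n, n), dtype=A.dtype)
    for x in range(n):
        for z in range(n):
            acc = None
            for y in range(n):
                term = mul(A[x, y], A[y, z])
                acc = term if acc is None else add(acc, term)
            out[x, z] = acc
    return out

A_bool  = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
A_score = np.array([[0.0, 2.0, -np.inf], [-np.inf, 0.0, 3.0], [-np.inf, -np.inf, 0.0]])

reach  = two_hop(A_bool,  np.logical_or, np.logical_and)  # Boolean: is there a 2-hop path?
counts = two_hop(A_bool,  np.add,        np.multiply)     # Counting: how many such paths?
best   = two_hop(A_score, np.maximum,    np.add)          # Viterbi: best path score (0-score self-loops allow shorter paths)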
Example: Hierarchical Resolution Transformers
To make this concrete, consider implementing a Hierarchical Resolution Transformer (HRT) on MORK using these techniques.
HRT constructs a multi-resolution pyramid of representations, with exponentially reduced sequence lengths at each level. Adjacent levels exchange information via cross-resolution attention.
In our framework:
Nodes represent resolution levels, token positions, and attention parameters
Edges encode the down-projection relationships between scales and cross-resolution attention patterns
Shards are organized by (sequence, layer, resolution, token-block)
Kernels implement self-attention, cross-attention, and gated fusion as tensor operations
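As a very rough toy illustration of the pyramid and the cross-resolution attention (the shapes and the mean-pooling stand-in for the down-projection are made up; the real HRT formulation is in the draft linked at the end):

import numpy as np

rng = np.random.default_rng(0)
d, L0 = 16, 32   # embedding dimension and base sequence length (arbitrary toy sizes)

# Three-level pyramid: halve the sequence length at each level
# (mean-pooling pairs of tokens stands in for a learned down-projection).
levels = [rng.normal(size=(L0, d))]
for _ in range(2):
    x = levels[-1]
    levels.append(x.reshape(x.shape[0] // 2, 2, d).mean(axis=1))

# Cross-resolution attention: fine-level queries attend to coarse-level keys/values,
# expressed as einsums over (sequence, dim) tensors.
fine, coarse = levels[0], levels[1]
scores = np.einsum('qd,kd->qk', fine, coarse) / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
attended = np.einsum('qk,kd->qd', weights, coarse)
print(attended.shape)   # (32, 16): each fine token mixes in coarse-level context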
The tensor logic formulation makes it natural to:
Express attention as sparse-to-dense transformations when patterns are low-rank
Track resource requirements to optimize tiling and caching
Mix learned (neural) and structured (logical) components in the same graph
Potentially replace backpropagation with local predictive coding updates—more on this in future posts
Quantifier Handling: Beyond Simple Joins
Real reasoning involves quantifiers—“for all X” and “there exists Y”—in complex patterns. Consider: “Every job posting has at least one required skill such that all candidates with that skill meet every requirement for the job.”
For all jobs j,
there exists a skill s,
such that for all candidates c,
for all requirements r of job j:
IF RequiresSkill(j,s) AND HasSkill(c,s)
THEN MeetsReq(c,r,j)
This alternating quantifier pattern (forall-exists-forall-forall) is expensive naively—potentially billions of operations. But tensor logic gives us optimization handles:
Early pruning: If a job has no required skills, fail fast
Skill clustering: Group similar skills to reduce search space
Incremental verification: Check high-confidence facts first
Sparse-to-dense transformation: Factor the sparse skill matrix for GPU efficiency
The naive O(|Jobs| × |Skills| × |Candidates| × |Requirements|) becomes tractable through principled optimization—and the framework tells us which optimizations are valid.
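Here is a tiny Boolean-tensor sketch of that forall-exists-forall-forall pattern in NumPy; the relations and sizes are randomly generated toys, but the reduction structure is the point:

import numpy as np

rng = np.random.default_rng(0)
J, S, C, R = 3, 4, 5, 2   # tiny made-up numbers of jobs, skills, candidates, requirements

requires  = rng.random((J, S)) < 0.5     # RequiresSkill(j, s)
has_skill = rng.random((C, S)) < 0.5     # HasSkill(c, s)
req_of    = rng.random((R, J)) < 0.7     # r is a requirement of job j
meets     = rng.random((C, R, J)) < 0.8  # MeetsReq(c, r, j)

# Broadcast the implication body over (j, s, c, r):
# (RequiresSkill(j,s) AND HasSkill(c,s)) -> MeetsReq(c,r,j), restricted to requirements of j.
antecedent = requires[:, :, None, None] & has_skill.T[None, :, :, None]  # (J, S, C, 1)
consequent = meets.transpose(2, 0, 1)[:, None, :, :]                     # (J, 1, C, R)
relevant   = req_of.T[:, None, None, :]                                  # (J, 1, 1, R)
implication = ~(antecedent & relevant) | consequent                      # (J, S, C, R)

inner  = implication.all(axis=(2, 3))  # for all candidates c and requirements r  -> (J, S)
exists = inner.any(axis=1)             # there exists a skill s                   -> (J,)
answer = exists.all()                  # for all jobs j                           -> True/False
print(answer)

The dense version above is exactly the naive O(|Jobs| × |Skills| × |Candidates| × |Requirements|) computation; the optimizations listed earlier are about reordering, pruning, and factoring these same reductions.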
How This Fits Our Larger Architecture
For those following Hyperon development, here’s how tensor logic fits the bigger picture:
MeTTa remains our high-level language for expressing cognitive algorithms—pattern matching, rule firing, attention allocation, inference control.
MeTTa-IL (coming early next year) will provide a compiled intermediate representation with resource management via the rho calculus.
Tensor logic serves as the bridge between MeTTa/symbolic operations and GPU-accelerated dense computation. It’s not a replacement for either world—it’s the translation layer that lets them cooperate efficiently.
RAPTL extends tensor logic with the uncertainty handling we need for PLN and other probabilistic components, plus the resource awareness that connects to our broader infrastructure for capability-controlled, auditable execution.
Tensor Logic in the Hybrid-Architecture Approach to AGI
Pedro Domingos has positioned tensor logic as potentially solving AI by unifying all the fragmented approaches. We’re more conservative. Tensor logic is tremendously useful as an interface technology—it lets symbolic and neural components share computation graphs and memory layouts in ways that were previously impractical.
But we don’t think any single representation or algorithm solves AGI. Our PRIMUS architecture involves probabilistic logic, evolutionary program learning, attention allocation, motivational systems, and more—each contributing capabilities the others lack. Tensor logic helps these pieces work together efficiently; it doesn’t replace the need for multiple complementary approaches.
Dig Into More Details
A few early rough-draft documents are available for those who want the technical details:
“RAPTL: Resource-Aware Probabilistic Tensor Logic” covers the full RAPTL framework with uncertainty quantification and linear logic modalities
“RAPTL on ShardZipper: Neural-Symbolic At Scale on GPU and CPU” provides practical guidance on the ShardZipper approach and includes a worked HRT example
These are fairly raw and will be refined, arXiv-ed and published as things progress; I’m sharing them now in a spirit of open-science / open-source collaborativeness.
We’re actively implementing these ideas as we scale up PRIMUS components on the Hyperon infrastructure. If you’re working on hybrid neural-symbolic systems and wrestling with similar integration challenges, my colleagues and I would also love to hear from you.
Ben Goertzel leads the Hyperon AGI project and serves as CEO of the Artificial Superintelligence Alliance. The tensor logic work described here is part of ongoing efforts to build scalable, transparent AGI systems.

