
What If Memory Were Math?
Framework Seed / Concept Paper

A proposal for temporal computing, where data isn't stored and retrieved. It's computed from a function of time at the point of use. No bus. No fetch. No bottleneck. Just math.

I want to be straightforward: this is an idea I believe in, not something I've proven. The math works. The logic holds up under pressure-testing. But none of this has been benchmarked, prototyped, or validated against real hardware. I'm a framework generator, not a chip designer. My role is to raise this up clearly enough that someone who is a chip designer can look at it and know whether it's worth pursuing. If this sparks something for someone smarter than me, that's the whole point.

The problem everyone works around

Every modern computer is built on the von Neumann architecture. The processor does the thinking. Memory holds the data. A bus connects them. Every time the processor needs data, it sends a request through the bus, waits, receives the data, and then computes.

The bus is the bottleneck. It has been since 1945. Everything we've built since then (caches, prefetching, branch prediction, memory hierarchies, DMA, NUMA) is a workaround for the same fundamental constraint: data and computation live in different places, and moving between them takes time.


A modern CPU can execute billions of operations per second, but a single memory fetch takes hundreds of clock cycles. The processor is fast. The bus is slow. The gap widens every hardware generation.

What if you eliminated the bus entirely? Not by making it faster. By making it unnecessary.

The core insight

Instead of storing a bit in memory and fetching it when needed, compute the bit's value as a mathematical function of the current time.

Every processing core holds a small set of parameters. At any given clock cycle, the core evaluates a function and produces a bit. That bit was never stored anywhere. It was calculated on the spot, from math.

But a core isn't limited to one function. You can layer multiple computations across different cycles. At cycle 0, the core evaluates function A. At cycle 1, function B. At cycle 6, you inject function C. The core's total period is determined by how many distinct computations it carries: one function repeats every cycle; thirty-five functions repeat every 35 cycles. Within that window, every state is deterministic.
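As a sketch of the mechanics (Python, with names I've invented for illustration), a core can be modeled as a list of per-cycle functions, and the claim that any cycle is computable in O(1) falls out directly:

```python
# Hypothetical sketch of a multi-function temporal core.
# Each layer is a function of the cycle; the cycle position selects the layer.

class TemporalCore:
    def __init__(self, layers):
        self.layers = layers            # list of functions: cycle -> bit

    @property
    def period(self):
        return len(self.layers)         # total period = number of layers

    def output(self, cycle):
        # O(1): pick the layer for this cycle position and evaluate it.
        return self.layers[cycle % self.period](cycle)

# Three layers -> the output repeats every 3 cycles.
core = TemporalCore([lambda c: 1, lambda c: 0, lambda c: 1])
print([core.output(c) for c in range(6)])   # 1, 0, 1, then repeats
print(core.output(1_000_000))               # evaluated directly, no replay
```

No history is stored anywhere: `output(1_000_000)` costs the same as `output(0)`.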

Live: multi-function temporal core
This core starts with a base function. Add more functions at specific cycles to build complex output patterns. The total period equals the number of functions. Every output is deterministic and computed fresh.

The key property: the state at any cycle is deterministic. You don't need to run cycles 0 through 999 to know what happens at cycle 1,000. You just evaluate the function at 1,000. O(1) lookup into any point in time.

And here's where it opens up: because the cycle position determines which function evaluates, you can build conditional logic into the cycle structure itself. If cycle % 7 == 3, do X; otherwise do Y. The cycles become a programming language. Branching, looping, conditional output, all expressed as deterministic functions of time. No stored instruction pointer. No program counter fetched from memory. The passage of time is the execution.
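That branching rule can be sketched as an ordinary pure function of the cycle (Python; the specific `cycle % 7 == 3` rule is just the example from above):

```python
# Hypothetical sketch: branching expressed as a pure function of the cycle.
# No instruction pointer is stored; the cycle number itself selects the path.

def program(cycle):
    if cycle % 7 == 3:
        return 1        # "do X" on every cycle congruent to 3 mod 7
    return 0            # "do Y" otherwise

# Jump straight to any cycle; no need to execute the earlier ones.
print(program(3))       # 1
print(program(10))      # 1  (10 % 7 == 3)
print(program(1_000_003))
```

The "program counter" here is just arithmetic on the cycle count, which is why any future state can be queried without replaying the past.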

"But real data isn't a wave pattern"

Square waves and phase offsets are fine for signal processing, but what about a user typing their name? That's arbitrary data with no mathematical structure.

But a constant is a valid temporal function. The function f(cycle) = 1 outputs 1 every cycle; f(cycle) = 0 outputs 0 every cycle. With those two constants, you can encode any binary data.

The critical thing to understand: these bits are not passively sitting in storage. They are being actively computed and output every single cycle. The grid at cycle N is the data, produced fresh by every core evaluating its function at that moment. Another core can "read" that data by sampling the grid at cycle N. The cycle number is the address. The grid state at that cycle is the value. That's the memory replacement mechanism.

So when you type "HELLO" and it fills the grid, what's actually happening is: each core has been given a constant function (1 or 0) that it evaluates at this frame's cycle. At the next cycle, those same cores might evaluate a completely different set of constants (a different frame), producing different data. The cores aren't storing "HELLO." They're computing the bits of "HELLO" on demand, at the cycle when it's needed, and computing something else at every other cycle.
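A sketch of that loading step, assuming a simple MSB-first ASCII encoding (the `load_frame` and `sample_grid` names are mine, not part of any specification):

```python
# Hypothetical sketch: arbitrary data encoded as constant temporal functions.
# Each bit of each character becomes a core whose function always returns
# that bit while the frame is active.

def load_frame(text):
    """Turn text into a list of constant functions, one per bit."""
    bits = []
    for ch in text:
        for i in range(7, -1, -1):              # 8 bits per ASCII char, MSB first
            b = (ord(ch) >> i) & 1
            bits.append(lambda cycle, b=b: b)   # constant function of time
    return bits

def sample_grid(frame, cycle):
    """'Read' the data by evaluating every core at the given cycle."""
    return [f(cycle) for f in frame]

frame = load_frame("HI")
print(sample_grid(frame, 0))   # the bits of 'H' then 'I', computed fresh
```

Sampling the same frame at cycle 0 or cycle 500 yields the same bits, because each core recomputes its constant rather than recalling a stored value.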

Live: data as temporal computation
Type anything. Each character becomes 8 bits actively computed on the grid. These bits aren't stored. They're produced fresh at every cycle this frame is active. Another core samples this grid at this cycle to "read" the data. The cycle is the address.
256-bit grid, actively computing: every bit is f(cycle) evaluated right now, with an ASCII breakdown per character.

This means the temporal architecture handles any data. A database record is a frame. An image is a (very large) frame. User input gets loaded as constant functions for a specific cycle window. The data was never "stored." It was loaded as parameters once, then computed locally from that point forward. Every subsequent "read" is a local evaluation. No bus. No fetch.

What this does to the bus

In a traditional system, if a piece of data gets read 1,000 times, that's 1,000 trips across the memory bus. In the temporal model, it's 1 parameter load + 999 local evaluations.

Traditional: 1,000 reads (CPU ↔ RAM). Every read is a bus traversal at ~100 ns each; 1,000 reads ≈ 100,000 ns of waiting.
Temporal: 1 load + 999 local evals (core + params). Parameters are loaded once; every read is a local evaluation at ~1 cycle. No bus.

For read-heavy workloads (database queries, neural network inference, image processing), this is a fundamental improvement. The data exists at the point of computation. There's nothing to fetch because there's nowhere else for it to be.
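A back-of-envelope version of that comparison, using the ~100 ns bus figure from above and an assumed ~0.5 ns per local evaluation (one cycle at 2 GHz; illustrative numbers, not measurements):

```python
# Cost model with assumed figures: one bus traversal ~100 ns,
# one local evaluation ~0.5 ns. Illustrative, not benchmarked.

BUS_NS = 100.0
EVAL_NS = 0.5
reads = 1_000

traditional_ns = reads * BUS_NS                   # every read crosses the bus
temporal_ns = BUS_NS + (reads - 1) * EVAL_NS      # one load, then local evals

print(traditional_ns)                  # 100000.0
print(temporal_ns)                     # 599.5
print(traditional_ns / temporal_ns)    # ~167x under these assumptions
```

The ratio scales with read count: the more often a value is reused, the more the one-time parameter load is amortized away.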

Computation without communication

When cores depend on each other, traditional systems write results to memory for other cores to read. Two bus traversals per handoff. In the temporal model, core A produces output at cycle N. Core B samples A's output at cycle N-1. No write, no bus, no fetch. Data flows through time offsets, not physical interconnects.
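A minimal sketch of that handoff (Python; the function names are mine): each core is a function of the cycle, and a dependency is just an evaluation at an earlier cycle:

```python
# Hypothetical sketch of the temporal pipeline: B reads A's previous-cycle
# output, C reads B's previous-cycle output. Dependencies are time offsets.

def a(cycle):
    return cycle % 2               # source: alternating bit

def b(cycle):
    return 1 - a(cycle - 1)        # invert A's output from one cycle earlier

def c(cycle):
    return b(cycle - 1)            # pass B's output through, one cycle later

# The whole pipeline state at any cycle is computable directly.
print([(a(n), b(n), c(n)) for n in range(2, 6)])
```

Nothing is written to shared memory between stages; "reading A at cycle N-1" is just evaluating `a(n - 1)` locally.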

Live: temporal pipeline
Core A produces a bit. Core B reads A's previous-cycle output and inverts it. Core C reads B's previous-cycle output. Data propagates through time, not space.
Pipeline: A (source) → B = !A[n-1] (invert) → C = B[n-1] (pass).

The system that can't stay broken

A temporal system recomputes state from scratch every cycle. A glitch at cycle 47 self-corrects at cycle 48 because the core recalculates from parameters, not from stored state. There is no stored state to corrupt.
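A toy version of that comparison (Python; the 8-bit pattern is arbitrary): the stored copy keeps the flipped bit, while the recomputed read never had state to lose:

```python
# Hypothetical sketch of the corruption comparison. The stored system keeps
# state; the temporal system recomputes from parameters every cycle.

pattern = [1, 0, 1, 1, 0, 0, 1, 0]     # 8-bit parameters (illustrative)

def temporal_read(cycle):
    # Recomputed fresh from parameters: nothing persists to corrupt.
    return list(pattern)

stored = list(pattern)                 # traditional copy sitting in "RAM"
stored[3] ^= 1                         # glitch at cycle 47 flips a bit

print(stored == pattern)               # False: the stored copy stays broken
print(temporal_read(48) == pattern)    # True: the next cycle is clean
```

The asymmetry is the point: a transient fault can only damage state that persists, and here nothing does.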

Live: corruption comparison
Both run the same 8-bit pattern. "Corrupt" flips random bits in the traditional system. It stays broken. The temporal system self-corrects every cycle.

This also eliminates cache coherency. In multi-core systems, modifying data that another core has cached triggers expensive coherency protocols. In the temporal model, there are no copies. Change a core's parameters, and every subsequent evaluation reflects the change. No copies to invalidate. No stale data. No coherency protocol.

Temporal RAM: the 10% idea

A practical first step: dedicate 90% of a GPU's cores to active computation, reserve 10% as "temporal RAM." These RAM cores hold constant functions representing stored data. A "read" is sampling a core's output. A "write" is updating that core's parameters. The compute cores and RAM cores are on the same die, synchronized to the same clock. Reads are local operations within the chip fabric. No external bus.

What this gives you: A memory system physically co-located with computation at the individual core level, with zero bus traversal for reads. This is processing-in-memory research approached from the temporal direction.
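A sketch of the read/write contract, assuming reads and writes reduce to output sampling and parameter updates as described (the class and method names are mine):

```python
# Hypothetical sketch of "temporal RAM": reserved cores hold constant
# functions; a read samples an output, a write replaces parameters.

class TemporalRAM:
    def __init__(self, size):
        # Each cell is a parameter, evaluated as a constant function of time.
        self.params = [0] * size

    def read(self, addr, cycle):
        # A read is a local evaluation at the current cycle, not a bus fetch.
        return self.params[addr]          # f(cycle) = constant

    def write(self, addr, bit):
        # A write is a parameter update; every later evaluation reflects it.
        self.params[addr] = bit

ram = TemporalRAM(16)
ram.write(5, 1)
print(ram.read(5, cycle=1_000))   # 1: same answer at any cycle
```

Because there is exactly one copy of each parameter, a write is visible to every subsequent read with nothing to invalidate, which is the no-coherency-protocol claim from the previous section.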

Rethinking the boot sequence

Traditional boot: BIOS loads from ROM, initializes RAM, loads the OS from disk into RAM, CPU starts fetching instructions through the bus. Every step is storage-to-bus-to-processor.

Temporal boot: parameter definitions load from M.2 into cores once. The cores start cycling. The OS is the temporal pattern. It doesn't "run on" the hardware; it is the hardware's state evolution. After the initial parameter load, the bus goes quiet. Everything runs locally.

Determinism without repetition

A common first reaction: "So it just repeats the same pattern forever?" No. Deterministic does not mean simple or obvious.

A core with 35 function layers has a period of 35 cycles. Within those 35 cycles, the output can be completely arbitrary: any combination of 1s and 0s, in any order, with any internal logic. From the outside, it might look random over short windows. But it's fully deterministic. You can compute the output at cycle 10,000 without running cycles 0 through 9,999. You just evaluate: 10000 % 35 = position 25, look up the function at position 25, and you have your answer.

Now layer conditional logic on top. A core's function at cycle N can depend on N % 7, or floor(N / 100), or any other derivable property of the cycle count. This creates branching behavior, phase transitions, and complex long-period sequences, all still deterministic, all still O(1) computable at any point.

When you're dealing with math, math is the only limitation. And math is not a very limiting thing. Conditional cycle structures, nested periodicities, function composition, cross-core dependencies via time offsets: these are all valid temporal operations. The cycles aren't just a clock. They're an execution environment. The passage of time is the program running.

Something being determinable doesn't mean it's immediately evident. You could have a 35-parameter equation that produces a sequence so complex it looks chaotic over short intervals. But at some point it repeats. At some point it ends. And you can determine what happens at each step the next time it runs. The complexity lives in the parameters, not in stored state. And parameters are loaded once.

What I don't know

Where the proven ground ends and the speculation begins:

Write-heavy workloads
Everything above appears to work beautifully for data written once and read many times. For workloads with constant new data (network routing, real-time sensor streams), the parameter injection path becomes the constraint. Dedicated injection-routing cores help, but I haven't proven the approach scales to millions of writes per second.
GPU timing precision
The architecture assumes deterministic cycle-level precision. Real GPU scheduling involves warp divergence and non-deterministic execution order. Whether existing CUDA hardware can provide the timing guarantees this model requires is an open engineering question.
Parameter injection bandwidth
Loading new data means updating core parameters. The speed of that propagation determines whether the hybrid architecture works for interactive workloads.
Comparison to existing approaches
Processing-in-memory, near-data computing, and computational storage all attack the von Neumann bottleneck from different angles. The temporal approach may converge with, complement, or be superseded by existing work. That determination requires literature review I haven't done.
Economic viability
Would performance benefits justify the cost of new paradigms, toolchains, and hardware? That's a question for people who build and ship silicon.

Where this goes from here

I'm a framework generator. I find patterns, pressure-test them, write them up clearly enough that someone else can take them further. I don't build chips. I don't have a lab.

What I have is an idea that I believe holds up: if you replace memory with math, you have the potential to eliminate the bus. If you eliminate the bus, you remove the fundamental constraint that's limiting classical computing architecture. And when your data, your computation, and your communication are all just functions of time evaluated locally, the distinction between memory, processing, and interconnect dissolves. They become the same thing.

The next step is empirical. Pick one narrow workload. Implement it both traditionally and temporally on the same GPU. Measure. The math says it should win on read-heavy, bandwidth-bound problems. A benchmark would prove it, or show its holes.

If you're someone with the expertise to run that benchmark, or the perspective to see where this breaks, I'd genuinely like to hear from you. The whole point of releasing frameworks is that someone else takes pride in developing them.

Interactive model: Want to explore the mechanics hands-on? The Temporal Computing Dashboard lets you build frame sequences, step through cycles, predict future states, and corrupt systems to see self-healing in action.