While it may not be the case, it could benefit from further development in that direction.
We already have multiple CPUs with private memory: L1 caches are private to each core in a multi-core CPU.
The challenge, however, is that sharing and synchronization are performed at too fine a granularity, which necessitates complex cache-coherence protocols.
Currently, data is passed between cores at the level of individual variables or memory cells, so an update to global memory can interrupt any processor.
I advocate using a narrower waist. The only way to share values between isolated processes would be a more constrained one: atomic, asynchronous insertion into queues. This mental model can be likened to an Arduino sending a packet of information to an ESP32 over a thin wire. We already know how to accomplish this type of task (“networking”), but 20th-century hardware constrained us to implement something more convoluted and synchronous within programming languages, i.e., function calling instead of message passing.
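To make the queue-only model concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a proposal for how hardware should do it: a thread stands in for an isolated core, `queue.Queue` stands in for the thin wire, and the names `counter_worker`, `run_demo`, and `STOP` are invented for this example. Python threads can share memory in principle, so the isolation here is by convention only; `multiprocessing.Queue` would be a closer match to truly private memory.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel message: tells the worker to finish


def counter_worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Sum the numbers received; the running total is private to this worker."""
    total = 0  # private state, reachable from outside only through messages
    while True:
        msg = inbox.get()  # blocks until one whole message is available
        if msg is STOP:
            outbox.put(("total", total))  # reply as a single message
            return
        total += msg


def run_demo(values):
    """Send each value as a message, then collect the worker's one reply."""
    inbox = queue.Queue()   # unbounded, so put() never waits for the receiver
    outbox = queue.Queue()
    worker = threading.Thread(target=counter_worker, args=(inbox, outbox))
    worker.start()
    for v in values:
        inbox.put(v)  # atomic enqueue; the sender carries on immediately
    inbox.put(STOP)
    worker.join()
    return outbox.get()


result = run_demo([1, 2, 3, 4])
print(result)
```

The sender never reads or writes `total` directly. It can only enqueue a message and, later, dequeue a reply, which is the narrow waist described above.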
We have been subjected to false advertising, such as the notion that “C” is a “low-level language.” Due to this misconception, we became enamoured with functional, stack-based calling conventions and overlooked other, very different ways of solving sub-problems. A glaring example is Forth, which does not implicitly use the C calling convention. Another example is Prolog (its syntax only resembles function calls; its clauses are not functions). I am not certain that Green Arrays’ Forth is “the” answer, but it certainly differs from what we consider “programming.” GA144: youtube.com/watch?v=0Pc…. I believe we are discouraged from considering this and other approaches by our devotion to synchronous, sequential, functional programming.
This can be accomplished easily with the software we already have, without any hardware modifications. It requires no new technology, only a shift in perspective.