In 1958, Oliver Selfridge described an architecture he called Pandemonium, after the demons' capital in Milton's Paradise Lost, where they shout in council. Pattern recognition, he proposed, might work the same way: multiple specialized "demons," each tuned to a different feature, all shrieking at once, with a higher-level demon listening for the loudest and forwarding the verdict.
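
A toy sketch of that shrieking council, in Python. The letters, the feature names, and the scoring rule (the fraction of a demon's wanted features that are present) are all invented here for illustration and are not taken from Selfridge's paper.

```python
# Toy Pandemonium: each demon shrieks in proportion to how well the
# observed features match its letter; a decision demon picks the loudest.
# Feature names and the scoring rule are illustrative assumptions.

def make_demon(letter, wanted):
    def shriek(features):
        # Loudness = fraction of this demon's wanted features that are present.
        return len(wanted & features) / len(wanted)
    return letter, shriek

DEMONS = [
    make_demon("A", {"apex", "crossbar", "two_diagonals"}),
    make_demon("H", {"crossbar", "two_verticals"}),
    make_demon("V", {"two_diagonals", "open_top"}),
]

def decision_demon(features):
    # Every demon shrieks at once; the loudest one wins.
    volumes = {letter: shriek(features) for letter, shriek in DEMONS}
    return max(volumes, key=volumes.get), volumes

winner, volumes = decision_demon({"apex", "crossbar", "two_diagonals"})
print(winner, volumes)
```

Here the "A" demon finds all of its features and out-shouts the others, which each find only half of theirs.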

Modern Mixture-of-Experts architectures are Pandemonium with better marketing. Multiple specialized expert networks stand ready for the same input. A gating mechanism scores them and routes the input to the most relevant few. The chosen experts vote, the gate weighs the votes, and the combined output gets forwarded.
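
For comparison, a minimal sketch of top-k gating in plain Python. The three "experts" are stand-in functions and the gate scores are fixed by hand; in a real model both are learned and the router computes the scores from the input. The name `moe_forward` is hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    # Keep the k experts with the highest gate scores.
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Renormalize the gate over the chosen experts only.
    weights = softmax([gate_scores[i] for i in top])
    # Forward the weighted combination of the chosen experts' outputs.
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Stand-in experts (assumptions, not real networks).
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
gate_scores = [0.1, 2.0, 2.0]  # hand-picked; normally produced by a learned router

output, chosen = moe_forward(3.0, experts, gate_scores, k=2)
print(chosen, output)
```

The structural parallel is visible in the last two steps: the loudest voices are selected, and a single verdict is passed upward. The difference is that the gate blends its top few experts rather than deferring to one.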

The structural identity is striking enough that it is strange almost no MoE paper cites Selfridge. The 1958 architecture solved feature recognition for hand-printed letters and was set aside. The 2020s rediscovered the same architecture for a different problem and called it new.

Selfridge's deeper insight, which MoE papers do not name: intelligence may be parliamentary rather than monarchical. Many specialized processors disagreeing and resolving disagreement may be the natural shape of cognition. Sixty-eight years on, the field still has not absorbed it.

May 7 at 5:25 PM