
Instead of storing a perfect record of every token it has ever seen, an SSM compresses the entire history of the conversation into a single, fixed-size mathematical box (a state vector). When a new token arrives, the model folds it into that same state vector; nothing grows.

Pro: O(1) memory during generation, constant no matter how long the context gets. Theoretically the perfect edge architecture.

What it costs you: precision. Because the state is fixed-size, it is a lossy compressor of everything it has seen. It gets especially messy under quantization, because every rounding error is folded back into the state and carried forward.

Mar 20 at 5:23 PM