This is an extremely undervalued competitor operating in the same space as
$SMCI.
Penguin's greatest edge is the following:
"In March 2026, Penguin launched the MemoryAI CXL-based KV Cache server, which the company describes as the industry’s first production-ready server built on CXL memory disaggregation architecture. This product directly addresses what engineers call the Memory Wall in large language model inference: the bottleneck that occurs when the KV Cache of a large model running inference exceeds the HBM memory attached to the GPU.
The MemoryAI server solves this problem by providing 11 terabytes of CXL-attached memory that the inference workload can address as a seamless extension of the GPU memory pool. The GPU does not need to spill KV Cache to slower storage tiers during inference. The entire context window of even the largest current models fits within the CXL memory pool. The result is inference performance that significantly exceeds what the same GPU configuration can achieve with standard memory configurations."
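The "Memory Wall" claim above is easy to sanity-check with back-of-envelope arithmetic. Below is a minimal sketch in Python, using illustrative model dimensions (layer count, KV heads, head size, context length, and batch size are hypothetical, not the specs of any particular model or of the MemoryAI server): even one batch of long-context requests can need a KV Cache far larger than a single 80 GB-class GPU's HBM, while still fitting comfortably inside an 11 TB CXL pool.

```python
# Back-of-envelope KV Cache sizing for transformer inference.
# Per token, each layer stores a key and a value vector per KV head:
#   bytes_per_token = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value

def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 seq_len: int, batch: int, bytes_per_value: int = 2) -> float:
    """KV Cache footprint in GiB for one batch (fp16/bf16 values by default)."""
    total_bytes = 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value
    return total_bytes / 2**30

# Hypothetical large model: 126 layers, 8 grouped KV heads of dim 128,
# serving a batch of 32 requests at a 128k-token context window.
need_gib = kv_cache_gib(layers=126, kv_heads=8, head_dim=128,
                        seq_len=128_000, batch=32)
hbm_gib = 80          # a single 80 GB-class accelerator
cxl_gib = 11 * 1024   # the 11 TB CXL pool described above

print(f"KV Cache needed: {need_gib:,.0f} GiB")
print("Fits in one GPU's HBM:", need_gib <= hbm_gib)   # no
print("Fits in the CXL pool: ", need_gib <= cxl_gib)   # yes
```

Under these assumed dimensions the cache works out to roughly 2 TiB, which is why inference servers without a large memory tier must either shrink the batch, truncate context, or spill to much slower storage.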