Uttam Dey (@ud30):


Nvidia, AMD, Micron, and all other chip stocks face a major reckoning ahead with the launch of DeepSeek v4.

I just finished reading the paper. Here are 3 key takeaways from the DV4 paper and how it might impact chips+AI capex:

1. Deep focus on KV cache and memory utilization. This was the first and most immediate takeaway for me. DV4’s KV cache size is just 10% of DV3’s. That’s offset by a 7-8x increase in the context window, to 1M tokens in DV4. This lets the KV cache fit better in HBM and possibly reduces the need to offload to other forms of memory/storage. So I suspect HBM demand will stay close to its current trajectory.
2. The Nvidia vs. Huawei debate grows louder now that DV4 has been validated on both Nvidia GPUs and Huawei NPUs. As for which Nvidia GPUs, the best guess is H100/H200, because those are the only ones legally allowed to be exported to China. Huawei and Cambricon already had early access to DV4 while Nvidia and AMD were locked out until GA, a departure from the standard practice for launching frontier AI models.
3. DV4 is also highly efficient and runs capably on cheaper GPU compute. It requires only 27% of the single-token inference FLOPs of DeepSeek-V3.2. In other words, the model can handle 1M-token contexts efficiently on less expensive (or older) GPU clusters, which could reduce demand for GPUs with high raw compute, like Nvidia’s, down the road. That probably explains why Jensen Huang launched so many products at GTC '26 that focused on memory (Groq LPUs & BlueField DPUs) and interconnect solutions rather than just the GPU roadmap.
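
A quick back-of-the-envelope check of the numbers above, as a rough sketch rather than anything taken from the paper. It assumes the 10% figure is a per-token KV cache ratio (which is how the "offset" argument reads) and that the context window grows 7-8x; the function and variable names are made up for illustration.

```python
def relative_kv_footprint(per_token_ratio: float, context_multiplier: float) -> float:
    """KV cache needed for one full-context sequence, relative to the older model.

    Assumption: cache size scales linearly with the number of tokens held.
    """
    return per_token_ratio * context_multiplier


KV_RATIO = 0.10                   # DV4 KV cache vs. DV3 (from the post; assumed per token)
CONTEXT_MULTIPLIERS = (7.0, 8.0)  # "7-8x" larger context window (from the post)
FLOPS_RATIO = 0.27                # single-token inference FLOPs vs. DeepSeek-V3.2 (from the post)

# Net KV footprint of a maxed-out context, relative to the older model's maxed-out context.
footprints = [relative_kv_footprint(KV_RATIO, m) for m in CONTEXT_MULTIPLIERS]

# How many times fewer FLOPs each generated token needs.
flops_reduction = 1.0 / FLOPS_RATIO

for multiplier, footprint in zip(CONTEXT_MULTIPLIERS, footprints):
    print(f"{multiplier:.0f}x context -> {footprint:.2f}x KV footprint per full-length sequence")
print(f"Per-token compute drops by about {flops_reduction:.1f}x")
```

Under those assumptions, a full 1M-token sequence needs roughly 0.7x to 0.8x the KV cache of a full-length sequence on the older model, so the memory savings are mostly spent on longer context, while per-token compute falls by about 3.7x. That asymmetry is the basis for reading this as steady for HBM and softer for raw compute.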

DV4 is possibly the first real-world commercial model to actually focus on KV cache optimization & memory utilization, presenting an entirely new architectural setup that will likely impact how data centers change their scale-up networks. The next 3 months are exciting.

It usually takes ~1-2 weeks after release to see whether there is significant (maybe viral) adoption of DV4. But a more structural adaptation of data center architectures in response to DV4, if it happens at all, will take at least one quarter.

Apr 25

9:03 AM
