Last week, The Information reported that xAI’s Colossus-1 in Memphis, Tennessee, achieves a mere 11% MFU (Model FLOPs Utilization), compared with the 45-55% that other hyperscalers achieve.
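
For readers new to the metric, here is a rough sketch of how MFU is commonly estimated: the FLOPs a training run usefully performs per second, divided by the hardware's theoretical peak. It assumes the widely used approximation of about 6 FLOPs per parameter per token for a dense transformer. The function name `estimate_mfu` and every number in the example are hypothetical placeholders, not figures from the report or the article.

```python
def estimate_mfu(params: float, tokens_per_second: float,
                 num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Rough MFU estimate for dense transformer training.

    Assumption: forward plus backward pass costs about 6 FLOPs
    per parameter per token (attention FLOPs are ignored).
    """
    achieved_flops_per_second = 6.0 * params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical cluster: all values below are illustrative placeholders.
example = estimate_mfu(
    params=70e9,                # 70B-parameter model (assumed)
    tokens_per_second=1.0e6,    # measured training throughput (assumed)
    num_gpus=1024,              # cluster size (assumed)
    peak_flops_per_gpu=989e12,  # per-GPU peak FLOPs/s (assumed)
)
print(f"Estimated MFU: {example:.1%}")
```

Anything that stalls the GPUs, such as waiting on communication or on a straggler, lowers `tokens_per_second` and therefore the ratio, while the peak in the denominator stays fixed.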

My latest article is a comprehensive analysis of:

  • What MFU really is, why it is hard to achieve, and where 50% of the FLOPs are lost: communication bottlenecks, silent data corruption (SDC), and stragglers

  • Lessons from Google, Meta, ByteDance, and DeepSeek

  • Who does it best, and

  • The uncomfortable truth behind deploying the latest NVIDIA GPUs.

Enjoy!

The Uncomfortable Truth Behind Deploying the Latest NVIDIA GPUs: MFU, Silent Data Corruption, and the Real Moat for CSPs and Hyperscalers
May 10 at 10:41 PM