
A couple of weeks ago I went viral for saying that in China, power for AI is a "solved problem," not only in generation but in transmission as well. That contrasts with Silicon Valley, where the debate is often whether there is enough electricity at all.

Elon Musk agrees. He noted just a few days ago that Chinese companies will be the toughest competitors because "they have more electricity than America," and he has been pretty consistent about electricity being a hard limiter for AI.

But having lots of generation and strong high-voltage lines does not mean the whole AI infrastructure problem is solved. So I dug into a Huawei white paper on AI data centers. It is dense but clear, and the main ideas are intuitive if you sit with them. I am not claiming deep expertise, so experts are welcome to weigh in.

What matters beyond “having power”:

1/ Size and density: Modern AI clusters are both huge and compact. Racks are moving from 20–50 kW to 100 kW or more. Entire sites can need 200 to 500 MW of steady supply. The challenge is not only producing electricity, but safely delivering that much power into one campus and one hall.
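The scale above is easier to feel with some back-of-envelope arithmetic. A minimal sketch, using illustrative numbers from the ranges in the post (site size, rack power, and PUE are assumptions, not measurements):

```python
# Back-of-envelope: how many 100 kW racks can a 300 MW campus feed?
# All figures are illustrative, taken from the ranges quoted above.

def racks_supported(site_mw: float, rack_kw: float, pue: float) -> int:
    """Racks a site can power once cooling/overhead (PUE) is paid for."""
    it_power_kw = site_mw * 1000 / pue   # power left over for IT load
    return int(it_power_kw // rack_kw)

# A 300 MW site at PUE 1.15 feeding 100 kW racks:
print(racks_supported(300, 100, 1.15))  # → 2608
```

Delivering that into one campus means every one of those ~2,600 racks needs a feed that would have powered an entire small data hall a decade ago.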

2/ Cooling: All that power becomes heat. Traditional air cooling will not keep up. Liquid cooling and high-efficiency layouts are becoming standard. Keep PUE at about 1.15 or better, or costs and stability degrade fast.
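To see why PUE 1.15 versus a sloppier figure matters, here is a rough cost comparison. The IT load and electricity price are hypothetical, chosen only to show the shape of the math:

```python
# Annual electricity cost for the same IT load at two PUE values.
# Load (200 MW) and price ($60/MWh) are hypothetical.

def annual_energy_cost(it_load_mw: float, pue: float, usd_per_mwh: float) -> float:
    """Yearly cost: IT load scaled up by PUE, running 8760 hours."""
    hours_per_year = 8760
    return it_load_mw * pue * hours_per_year * usd_per_mwh

good = annual_energy_cost(200, 1.15, 60)
bad = annual_energy_cost(200, 1.50, 60)
print(f"extra cost at PUE 1.50: ${bad - good:,.0f}/yr")
```

At this scale a few tenths of PUE is tens of millions of dollars a year, before counting the stability problems the post mentions.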

3/ Networking: More GPUs do not guarantee faster AI. If the interconnect cannot feed them, performance stalls. Fabrics are shifting from 200G to 400G and 800G, with smarter scheduling and topology to keep large clusters saturated.
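A crude way to see the interconnect bottleneck: how long it takes to push one step's worth of gradient traffic through a NIC at each link speed. The payload size is a made-up illustration, not a real model's number:

```python
# If the fabric can't feed the GPUs, the step time is set by the wire,
# not the silicon. Payload size (20 GB/step) is hypothetical.

def transfer_time_ms(payload_gb: float, link_gbps: float) -> float:
    """Milliseconds to move payload_gb gigabytes over a link_gbps link."""
    return payload_gb * 8 / link_gbps * 1000  # GB -> gigabits, s -> ms

payload = 20  # GB of gradient traffic per GPU per step (illustrative)
for rate in (200, 400, 800):
    print(f"{rate}G link: {transfer_time_ms(payload, rate):.0f} ms")
```

Doubling the link rate halves the communication floor, which is why the 200G-to-800G migration matters more than adding GPUs once a cluster is communication-bound.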

4/ Operations: At this scale, reliability depends on automation. Sites need continuous telemetry, anomaly detection, and self-healing so common faults resolve in minutes, not hours.
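The telemetry-plus-anomaly-detection loop can be sketched very simply. This is a toy rolling z-score check, not any vendor's actual system; the readings and thresholds are invented for illustration:

```python
# Minimal sketch of automated telemetry monitoring: flag readings that
# drift far from the recent trailing window, so a remediation job can
# fire in minutes instead of waiting on a human. Data is made up.

from statistics import mean, stdev

def anomalies(readings: list[float], window: int = 5, z: float = 3.0) -> list[int]:
    """Indices where a reading sits more than z std devs from the trailing window."""
    flagged = []
    for i in range(window, len(readings)):
        w = readings[i - window:i]
        mu, sigma = mean(w), stdev(w)
        if sigma and abs(readings[i] - mu) > z * sigma:
            flagged.append(i)
    return flagged

temps = [45, 46, 45, 47, 46, 45, 72, 46]  # inlet temps, deg C; 72 is a fault
print(anomalies(temps))  # flags the spike
```

Real operations stacks layer forecasting and automated remediation on top, but the core idea is the same: the plant watches itself continuously instead of waiting for alarms.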

5/ Real-world performance: Training gets the headlines, but inference drives user experience and cost. Latency, accuracy, concurrency, and energy per request vary by workload. A data center tuned for 30 ms recommendations is not the same as one tuned for 200 ms voice.
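One concrete reason the 30 ms and 200 ms builds differ: energy per request depends on latency and on how many requests share the accelerator. A toy model with hypothetical figures (GPU power, latencies, concurrency are all assumptions):

```python
# Energy attributed to a single inference request when `concurrency`
# requests share one GPU for the request's latency window.
# All figures below are hypothetical.

def energy_per_request_j(gpu_power_w: float, latency_ms: float, concurrency: int) -> float:
    """Joules per request: GPU power x request duration, split across requests."""
    return gpu_power_w * (latency_ms / 1000) / concurrency

# 700 W GPU: short recommendation calls vs. longer voice calls
print(energy_per_request_j(700, 30, 16))   # ≈ 1.3 J
print(energy_per_request_j(700, 200, 4))   # ≈ 35 J
```

An order-of-magnitude gap in energy per request means different batching, different cooling headroom, and ultimately a different data center design for each workload.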

China’s edge in generation and transmission is real, and it lowers a major barrier. The remaining work is practical: deliver extreme power density to the rack, remove the heat, wire the cluster so GPUs are not starved, run the plant with automation, and match the build to the workload. Those factors will decide who builds the best AI infrastructure. Abundant electricity is necessary to even play, but it is not sufficient to win.

-Rui

Sep 11 at 5:32 AM