Demand is booming. Zhipu, for example, is serving 5.5 trillion tokens per day. Developers are rushing to the platform, joining at about ten a minute. Keeping up with inference loads is a struggle in China, as it is in the US.
Teams universally acknowledged compute constraints, particularly the shortage of Nvidia chips. But that isn’t stopping innovation under those constraints.
Researchers and developers are free to use whichever model they want. Anthropic’s Claude was the preferred choice for technical teams, but it was clear that “dog-fooding” is commonplace.
We’ll write something in more depth in the next couple of weeks.
May 5
at
5:00 PM
Relevant people
Log in or sign up
Join the most interesting and insightful discussions.