There's some confusion about the "wall" that many AI people are talking about. First, it's not an AI winter—far from it. AI progress is rapid: we're seeing breakthroughs in music generation, image creation, text-to-speech, video generation, protein folding, and more. We're in a golden age of AI.

The wall isn't about LLMs like Gemini, Claude, or ChatGPT stagnating. These models will continue to improve—that's not in question.

The wall is about diminishing returns in scaling LLM pre-training. From GPT-1 to GPT-4, we've benefited from scaling model size and dataset size together (the approach the Chinchilla scaling laws formalized). But now, we've exhausted most of the available internet text data, so we can't keep scaling both in the same way. We can still train models longer by repeating the data, but gains diminish after a few epochs. That's the current reality: we're hitting limits in pre-training.
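To make the data constraint concrete, here's a rough back-of-the-envelope sketch in Python. It uses two common approximations: the Chinchilla rule of thumb of about 20 training tokens per parameter, and training compute of about 6 × parameters × tokens FLOPs. The function names are made up for illustration, and `ASSUMED_UNIQUE_TOKENS` is a hypothetical placeholder for the stock of usable text, not a measured figure.

```python
# Rough sketch of why scaling model and data together runs into a data limit.
# Both constants below are widely used approximations, not exact laws.

TOKENS_PER_PARAM = 20   # Chinchilla rule of thumb: ~20 tokens per parameter
FLOPS_PER_PARAM_TOKEN = 6  # training compute approximation: C ~ 6 * N * D


def compute_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training tokens for a model of n_params."""
    return TOKENS_PER_PARAM * n_params


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training FLOPs for n_params trained on n_tokens."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


def epochs_needed(n_params: float, unique_tokens: float) -> float:
    """How many passes over the unique data the compute-optimal budget implies."""
    return compute_optimal_tokens(n_params) / unique_tokens


if __name__ == "__main__":
    # Hypothetical stock of usable unique text tokens (an assumption for illustration).
    ASSUMED_UNIQUE_TOKENS = 15e12

    for n_params in (7e9, 70e9, 700e9, 7e12):
        tokens = compute_optimal_tokens(n_params)
        flops = training_flops(n_params, tokens)
        epochs = epochs_needed(n_params, ASSUMED_UNIQUE_TOKENS)
        print(
            f"{n_params:.0e} params -> {tokens:.1e} tokens, "
            f"{flops:.1e} FLOPs, {epochs:.2f} epochs over assumed data"
        )
```

Under these assumptions, every 10x in parameters asks for 10x more tokens, so the implied number of epochs over a fixed data stock grows just as fast. Once that number climbs past a few epochs, we're in the repeated-data regime where gains diminish.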

So, what's next? If pre-training is slowing down, maybe post-training is about to become more important than ever.

Nov 23 at 5:50 PM