akashbajwa.substack.com…

Pre-training is definitely changing. MoE (Mixture of Experts) aligns specialized verticals with the long-term trends.

Agentic AI: spread each query across an MoE, which holds many experts but activates only a few areas, so "less of the pre-training" has to be available per query. This downsizes enormous LLMs into smaller, active agentic tasks. Consider 800M learning points (parameters), with < 50M in use. A truly dramatic downsizing.
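To make the downsizing concrete, here is a minimal sketch of top-k expert routing in plain Python. Everything in it is illustrative: the toy experts, the gate scores, and the function names are assumptions, and the even split of 800M parameters into 16 experts of 50M each is a simplification chosen only to match the numbers above. It is not DeepSeek's actual architecture.

```python
import math

TOTAL_PARAMS = 800_000_000   # "800M learning points" from the note
ACTIVE_PARAMS = 50_000_000   # upper bound from the note ("< 50M in use")


def active_fraction(active, total):
    """Share of the model that does work for a single query."""
    return active / total


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    chosen = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Hypothetical toy setup: 16 experts, each just scales its input.
experts = [lambda x, scale=s: scale * x for s in range(1, 17)]
# Hypothetical gate scores; a real router computes these from the query.
gate_scores = [0.1 * i for i in range(16)]

if __name__ == "__main__":
    print(f"active share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.2%}")
    print("experts used:", [i for i, _ in route_top_k(gate_scores, 2)])
    print("output:", moe_forward(2.0, experts, gate_scores, 1))
```

With these figures, at most 6.25% of the parameters are active per query; the other experts stay idle.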

DeepSeek showed this architectural approach.

OpenAI showed the power of time: instead of blurting out the answer quickly, reflect for 20% of the "tokens" expended and consider the alternatives one more time.

Quality of reflection is orders of magnitude better than speed and brute performance.

AI is progressing. Engage and understand. It is a force multiplier today.

Training, inference, reasoning, engaging.

Model size doubles every 5 months
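As a quick arithmetic check on what that doubling rate implies, the sketch below compounds it over a given number of months. The 5-month doubling period is the figure claimed above, taken as given rather than verified; the function name is an assumption for illustration.

```python
def growth_factor(months, doubling_months=5.0):
    """Multiplier on model size after `months`, given a fixed doubling period."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    for months in (5, 10, 12, 24):
        print(f"{months:>2} months: x{growth_factor(months):.2f}")
```

If the rate held, a year would mean roughly a 5x increase in model size.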

#AI #DataCenters #HyperscaleDataCenter

Test-Time Search: A Path To AGI
Apr 4, 2025 at 2:41 PM