Since o1, and especially since DeepSeek-R1, improved "reasoning" has basically become standard for new LLMs. Case in point: Gemini 2.5 Pro just came out as the latest reasoning model, and it sits at the top of most benchmarks (notably Humanity's Last Exam).
However, beyond just improving the reasoning abilities of LLMs…