I believe that at this stage one should be extremely cautious about company demos. The proof/counterexample was obtained by an undisclosed custom internal model that seems to have been specifically tailored/fine-tuned to the problem by two future Fields Medalist-caliber mathematicians, Mark Sellke and Mehtaab Sawhney, with the help of another CS prodigy, Lijie Chen. They probably iteratively fine-tuned the scaffolding, model, context/RAG and verifier and biased it towards promising strategies (all of which can then conveniently be hidden behind the undisclosed internal model). OpenAI then claimed an autonomous result when the model, after a 125-page random walk (and that's just a summary), was able to stumble over the finishing line under the watchful eyes of expert mathematicians, who then picked up the output, checked it and turned it into an actual proof.
The fact that, nowadays, a group of brilliant mathematicians tweaking an AI into producing a proof is considered a much bigger breakthrough than the same mathematicians proving the result themselves should in itself be enough to show how proficient these systems actually are at mathematics.
May 22 at 6:17 PM