Looks like we got a new DeepSeek model over the holidays (again):
It basically pushes RLVR (reinforcement learning with verifiable rewards) and self-refinement to gold-level scores on IMO 2025.
Coincidentally, I am currently working on a chapter on self-refinement, and this comes in handy as a nice, scaled-up case study.
github.com/deepseek-ai/DeepSeek-Math-V2