Here's an interesting visual reasoning benchmark on which 3-year-olds apparently handily beat all frontier models. xbench.org/agi/babyVisi…