Google has unveiled a new technology called “Agentic Vision” in Gemini 3 Flash.
The capability is designed to improve Gemini’s image analysis. Instead of analyzing an image in a single pass and guessing at details it cannot make out, the model runs a “Think, Act, Observe” loop: it formulates a plan, writes and executes Python code to manipulate the image (cropping, zooming, annotating, measuring), and then observes the results before providing an answer.
For product teams building AI-powered features where users upload artifacts, this could make those features more accurate.
For example, e-commerce platforms could verify product listings by counting items and confirming each item’s condition with annotated proof, and insurance systems could assess damage claims by cropping and labeling each affected area.
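To make the “Think, Act, Observe” loop more concrete, here is a minimal sketch in plain Python. It is not Google's implementation and does not call the Gemini API: the image is a toy grid of grayscale values, and `toy_planner`, `think_act_observe`, `crop`, `zoom`, and `count_bright` are hypothetical names invented for illustration. In the real feature the model itself decides the next step and writes the image-manipulation code; here a hard-coded planner stands in for that decision.

```python
# Toy illustration of a think/act/observe loop over an image.
# Assumption: an "image" is a list of rows of grayscale pixel values (0-255).

def crop(img, left, top, right, bottom):
    """Return the region covering columns left..right-1 and rows top..bottom-1."""
    return [row[left:right] for row in img[top:bottom]]

def zoom(img, factor):
    """Nearest-neighbour upscale: repeat every pixel `factor` times on both axes."""
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def count_bright(img, threshold=128):
    """A stand-in 'measurement': how many pixels are at or above the threshold."""
    return sum(px >= threshold for row in img for px in row)

def toy_planner(observations):
    """Hypothetical 'Think' step; a real system would ask the model what to do next."""
    if not observations:
        return ("crop", (2, 0, 4, 2))  # inspect the top-right quadrant first
    if len(observations) == 1:
        return ("zoom", 2)             # enlarge the cropped region
    return ("answer", None)            # enough evidence gathered

def think_act_observe(img, planner, max_steps=5):
    """Run the loop and return the list of observations that were collected."""
    observations = []
    current = img
    for _ in range(max_steps):
        action, arg = planner(observations)   # Think
        if action == "answer":
            break
        if action == "crop":                  # Act
            current = crop(current, *arg)
        elif action == "zoom":
            current = zoom(current, arg)
        # Observe: record (action, width, height, bright-pixel count)
        observations.append(
            (action, len(current[0]), len(current), count_bright(current))
        )
    return observations

image = [
    [0, 0, 200, 0],
    [0, 0, 0, 255],
    [0, 0, 0, 0],
    [90, 0, 0, 0],
]
print(think_act_observe(image, toy_planner))
# [('crop', 2, 2, 2), ('zoom', 4, 4, 8)]
```

Each pass through the loop changes what the next “Think” step can see, which is the difference from a single-pass analysis: the final answer is based on the recorded observations rather than on one look at the full image.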