Most engineering teams now generate code with one LLM and review it with another LLM (or the same one). Two checks. Feels like defense-in-depth.
It isn't. Researchers found that LLMs fail to correct errors in their own output 64.5% of the time. The same training distribution that produces SQL injection (CWE-89), SSRF (CWE-918), and XSS (CWE-79) in the first place also keeps the model from flagging them at review.
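For concreteness, here is a minimal sketch (my illustration, not code from the post) of the CWE-89 pattern a model can both write and wave through at review. The function names and schema are hypothetical; the point is that the interpolated version looks fluent and "normal," which is exactly why a statistical reviewer passes it:

```python
import sqlite3

def find_user_vulnerable(conn, username):
    # The shape an LLM often generates: attacker input interpolated
    # straight into SQL (CWE-89).
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # Parameterized query: the driver treats input as a literal value.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "x' OR '1'='1"
print(len(find_user_vulnerable(conn, payload)))  # 2: injection dumps every row
print(len(find_user_safe(conn, payload)))        # 0: payload matched as a plain string
```

Both functions are syntactically valid, idiomatic-looking Python; only the parameterized one survives hostile input, and nothing in the model's next-token objective distinguishes them.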
The fix isn't a better prompt. It's understanding why statistical models can't do adversarial reasoning, and building a pipeline that accounts for it. I broke down the three failure mechanics and what actually catches these vulns before prod.
Guest post on The AI-Augmented Engineer by Jeff Morhous!