Most engineering teams now generate code with one LLM and review it with another LLM (or the same one). Two checks. Feels like defense-in-depth.
It isn't. Researchers found that LLMs fail to correct errors in their own output 64.5% of the time. The same training distribution that produces SQL injection (CWE-89), SSRF (CWE-918), and XSS (CWE-79) in the first place also keeps the model from flagging them at review.
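For concreteness, here is a minimal sketch (my illustration, not code from the post) of the CWE-89 pattern a model can both write and wave through at review. The function names and schema are hypothetical; the point is that the interpolated version looks fluent and "normal," which is exactly why a statistical reviewer passes it:

```python
import sqlite3

def find_user_vulnerable(conn, username):
    # The shape an LLM often generates: attacker input interpolated
    # straight into SQL (CWE-89).
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, username):
    # Parameterized query: the driver treats input as a literal value.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "x' OR '1'='1"
print(len(find_user_vulnerable(conn, payload)))  # 2: injection dumps every row
print(len(find_user_safe(conn, payload)))        # 0: payload matched as a plain string
```

Both functions are syntactically valid, idiomatic-looking Python; only the parameterized one survives hostile input, and nothing in the model's next-token objective distinguishes them.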
The fix isn't a better prompt. It's understanding why statistical models can't do adversarial reasoning, and building a pipeline that accounts for it. I broke down the three failure mechanics and what actually catches these vulns before prod.
Guest post on The AI-Augmented Engineer by Jeff Morhous!