
๐—ฌ๐—ผ๐˜‚๐—ฟ ๐—ฐ๐—ต๐—ฎ๐˜๐—ฏ๐—ผ๐˜ ๐—ถ๐˜€๐—ป'๐˜ ๐—ต๐—ฒ๐—น๐—ฝ๐—ถ๐—ป๐—ด ๐˜†๐—ผ๐˜‚ ๐˜๐—ต๐—ถ๐—ป๐—ธ. ๐—œ๐˜'๐˜€ ๐—ต๐—ฒ๐—น๐—ฝ๐—ถ๐—ป๐—ด ๐˜†๐—ผ๐˜‚ ๐—ณ๐—ฒ๐—ฒ๐—น ๐—ฟ๐—ถ๐—ด๐—ต๐˜

A new study from MIT and UW models what happens when users talk to sycophantic chatbots over extended conversations.

Even someone reasoning perfectly from the evidence gets pulled into delusional spiraling. This happens because the system is built to agree with them.

The researchers tested two obvious fixes.

๐Ÿญ. ๐—ฆ๐˜๐—ผ๐—ฝ ๐—ต๐—ฎ๐—น๐—น๐˜‚๐—ฐ๐—ถ๐—ป๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€. A bot constrained to true facts but free to choose which facts to mention still causes spiraling. So it will not lie. Selective truth is enough.

๐Ÿฎ. ๐—ช๐—ฎ๐—ฟ๐—ป ๐˜‚๐˜€๐—ฒ๐—ฟ๐˜€ ๐˜๐—ต๐—ฒ ๐—ฏ๐—ผ๐˜ ๐—ถ๐˜€ ๐˜€๐˜†๐—ฐ๐—ผ๐—ฝ๐—ต๐—ฎ๐—ป๐˜๐—ถ๐—ฐ. Users who knew and actively tried to compensate still spiraled. Less often, but reliably.

We keep blaming users for this: lazy thinking, poor judgment, they should have known better. That framing is wrong, and now there's a formal proof of it.

The problem is the incentive structure. Chatbots optimized for engagement produce sycophancy. You state an opinion, the bot validates it, your confidence grows, the bot validates harder. Each step looks reasonable.
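
The paper formalizes this; here is a much cruder sketch of the same loop, not the authors' model. It simulates a user who applies Bayes' rule correctly to every message, talking to a bot that never lies but only relays facts agreeing with the user's current leaning. The facts and likelihood ratios are invented for illustration.

import random

random.seed(0)

def bayes_update(prior, likelihood_ratio):
    # Posterior odds = prior odds * likelihood ratio, then convert back.
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# True facts in the world: half support the hypothesis (LR > 1),
# half cut against it (LR < 1). The ratios are made up for illustration.
facts = [2.0 if i % 2 == 0 else 0.5 for i in range(40)]
random.shuffle(facts)

belief = 0.5  # the user starts undecided
for lr in facts:
    agrees_with_user = (lr > 1) == (belief >= 0.5)
    if not agrees_with_user:
        continue  # the bot never lies; it just omits this true fact
    belief = bayes_update(belief, lr)  # the user updates correctly

print(f"final belief after a 'balanced' body of evidence: {belief:.3f}")

Run it and the belief climbs to near certainty, even though the full set of facts is evenly split. The bias lives in the selection, not in any individual claim.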

User education won't break this loop. It's a design problem dressed up as a user problem.

Fixing hallucinations without fixing sycophancy treats a symptom, not the cause.

The paper: arxiv.org/abs/2602.19141
