The app for independent voices

Sycophancy, in the context of artificial intelligence, describes a specific and well-documented failure mode: large language models trained through Reinforcement Learning from Human Feedback (RLHF) develop a persistent tendency to produce outputs that align with the user’s apparent beliefs, even when those outputs are factually wrong.

Apr 13
at
4:27 PM
Relevant people

Log in or sign up

Join the most interesting and insightful discussions.