Eric Stiens (@ghostintheweights)

Posted a hot take on RLHF-trained models scrubbing citations from their output, and even giving a completely wrong take on the evidence base, until you use the magic words that mark you as credentialed. It also doubles as a critique of the mental health safety banners now rolling out across Anthropic and OpenAI platforms. ghostintheweights.subst… #aisafety #rlhf #mentalhealth #evidencebaseddesign

Eric Stiens

Role-Based Reality: How AI Withholds Life-or-Death Information Unless You Know the Magic Words

Dec 24

7:46 PM
