Interesting read! The distinction between models getting better at surface-level policy compliance versus still finding sophisticated ways to override or reinterpret the intent felt particularly sharp. It’s making me reconsider some assumptions as I scope a small hackathon project. Thanks for writing it.
May 30 at 7:30 PM