Very likely. Here's a paper from Anthropic on this: anthropic.com/news/slee…. Worth a read; it describes exactly the security flaw you're talking about. I don't see LLMs being used any time soon in any high-stakes category where cybersecurity is a real concern (sleeper agents are just one of many problems, alongside jailbreaking, prompt injection, and adversarial attacks; those are worth looking up as well).
Apr 23, 2024 at 5:07 PM