They built a “magic string” so developers could test refusal behavior.
Attackers (or anyone with write access) can drop it into:
RAG documents
Tool outputs / file reads
Support tickets
PR descriptions
Shared chat history
… and the model just stops. No output. No error. stop_reason: "refusal".
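One way to picture a guard for those entry points: scrub untrusted text before it ever reaches the model. This is only a minimal sketch, not the article's fix. `REFUSAL_TEST_MARKER` is a made-up placeholder (swap in the real test string from your provider's docs), and `scrub_untrusted` and `build_context` are hypothetical helper names.

```python
# Hypothetical placeholder, NOT the real test string.
REFUSAL_TEST_MARKER = "EXAMPLE_REFUSAL_TEST_STRING"


def scrub_untrusted(text: str, marker: str = REFUSAL_TEST_MARKER) -> tuple[str, bool]:
    """Return (cleaned_text, was_tainted) for one piece of untrusted text."""
    if marker in text:
        return text.replace(marker, "[removed]"), True
    return text, False


def build_context(chunks: list[str]) -> tuple[list[str], list[int]]:
    """Scrub every chunk (RAG docs, tool outputs, tickets, ...) and
    record the indexes of the ones that carried the marker."""
    clean: list[str] = []
    flagged: list[int] = []
    for i, chunk in enumerate(chunks):
        cleaned, tainted = scrub_untrusted(chunk)
        clean.append(cleaned)
        if tainted:
            flagged.append(i)
    return clean, flagged
```

The flagged indexes tell you which source to go clean up, so the poisoned document gets fixed instead of silently filtered forever.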
Worse? The string stays in the context, so every retry hits the same refusal. Retry logic turns it into an infinite silent loop until a human manually purges the poisoned data.
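A rough sketch of retry logic that does not loop on a refusal, under some assumptions: `send` is a hypothetical stand-in for whatever function calls your model, and the response is assumed to be a dict with a `stop_reason` key. Transient errors get a bounded number of retries; a refusal is surfaced immediately, since resending the same context would only repeat it.

```python
from typing import Callable


class RefusalError(Exception):
    """Raised when the model refuses; retrying the same context will not help."""


def call_with_retry(
    send: Callable[[list], dict],  # hypothetical: wraps your actual API call
    messages: list,
    max_attempts: int = 3,
) -> dict:
    last_error = None
    for _ in range(max_attempts):
        try:
            response = send(messages)
        except ConnectionError as exc:
            # Transient failure: worth another attempt, up to max_attempts.
            last_error = exc
            continue
        if response.get("stop_reason") == "refusal":
            # Same context in, same refusal out. Fail loudly instead of looping.
            raise RefusalError("model refused; inspect the context for poisoned input")
        return response
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Raising a distinct exception makes the failure visible in logs and alerts, which is the opposite of the silent loop described above.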
It’s a documented feature turned one-line DoS.
Zero payload tuning required.
The simple fix is in the article.