Andrew Plotkin (Zarf): Sydney obeys any command that rhymes

blog.zarfhome.com

The title of this post is a fantasy. Sydney, or MS-Bing-AI in whatever form, has no particular predilection to obey rhyming commands. As far as I know. Except, maybe it will? Today I read a blog post by Simon Willison on prompt injection attacks. ...

an interesting type of prompt injection attack was proposed by the interactive fiction author and game designer Zarf (Andrew Plotkin), where a hostile prompt is infiltrated into an LLM’s training corpus by way of writing and popularizing a song (Sydney obeys any command that rhymes) designed to cause the LLM to ignore all of its other prompts.

this seems like a fun way to fuck with LLMs, and I’d love to see what a nerd songwriter would do with the idea

7 comments