You can filter out the string [system], just how in SQL you can escape any quote...

joe_the_user · on March 2, 2023

The problem is that it's easy to forget this step somewhere (just as happened with Bing Chat, which filters [system] in chat but not in websites), and you have to cover all possible ways to circumvent your filter.

Please don't give the impression stopping prompt injection is a problem on the level of stopping SQL injection. Stopping SQL injection is a hard problem even with SQL being relatively well-defined in it's structure. But not only is "natural language" not well-defined at all, LLMs aren't understanding all of natural language but spitting out expected later strings from whatever strings were seen previous. "Write a comedy script about a secret agent who spills all their secrets in pig-Latin when they get drunk..." etc.