Hacker News new | past | comments | ask | show | jobs | submit login

It's much worse actually because its extremely hard to even figure out if you have a security issue because it involves NLP.



This is worse because "prompt injection" is a feature, not a bug.

If you want a generic AI to talk to, then whatever you talk it into - such as rules of behavior, or who to trust - someone else will be able to talk it out of. Just like with humans.

Others mention the problem is lack of separation between control/code and data - technically yes, but the reason isn't carelessness. The reason is that code/data separation is an abstraction we use to make computers easier to deal with. In the real world, within the runtime of physics, there is no such separation. Code/data distinction is a fake reality you can only try and enforce, with technical means, and it holds only if the thing inside the box can't reach out.

For an LLM - much like for human mind - the distinction between "code" and "data" is a matter of how LLM/brain feels like interpreting it at any given moment. The distinction between "prompt injection attack" and a useful override is a matter of intent.


And what happens if your application does not handle the LLM response correctly (buffer overflow anyone)? Yep your own LLM will attack you.

Get your popcorn ready, remember the silly silly exploits of the early 2000s? We are about to experience them all over again! :D




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: