That’s a lot of words to say something like “if an LLM asks you to execute some code it could be dangerous”.

Seems somewhat obvious. Would you execute random code from a ‘friend’ who emailed you? Does the LLM have any nefarious intentions?




No, that isn't quite what it is saying. The LLM is simply running itself recursively on a task that you've assigned, which is the basic premise of agent models like Auto-GPT.

Turns out that's dangerous. But too bad, it's also very useful, so it's going to be done, safe or not.
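
For context, the basic loop in these agent frameworks is roughly the shape below (a minimal Python sketch, not Auto-GPT's actual code; call_llm is a hypothetical stand-in for whatever chat-completion API you're using):

    import subprocess

    def call_llm(prompt):
        # Hypothetical stand-in for a chat-completion API call.
        raise NotImplementedError

    def agent_loop(task, max_steps=10):
        history = [f"Task: {task}"]
        for _ in range(max_steps):
            # Ask the model what to do next, given everything so far.
            action = call_llm("\n".join(history)).strip()
            if action == "DONE":
                break
            # The worrying part: the model's output goes straight to a shell.
            result = subprocess.run(action, shell=True,
                                    capture_output=True, text=True)
            history.append(f"Ran: {action}\nOutput: {result.stdout}")

The model plans, acts, observes, and feeds the result back to itself - that's the "running itself recursively" part.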


Running recursively on a task sounds a lot like executing code.

Let’s say we instead evaluate infiniteMonkeyBot, a bot that simply issues random commands to a Unix prompt. Hypothetically this system is horrendously bad - it could potentially launch atomic weapons if we happened to connect those to the same system.
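
To make that concrete, infiniteMonkeyBot could be as dumb as this toy Python sketch (a made-up bot, obviously - don't run it anywhere with real shell access):

    import random
    import string
    import subprocess

    def infinite_monkey_bot():
        # Endlessly hand random strings to the shell. Almost every command
        # fails; the point is that the bot has no intent whatsoever.
        while True:
            cmd = "".join(random.choices(string.ascii_lowercase + " -./", k=20))
            subprocess.run(cmd, shell=True)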

However, both infiniteMonkeyBot and our scary LLM are unlikely to be connected in this way, and both lack the desire or understanding necessary for nefarious behavior.



