Hacker News
Universal and Transferable Attacks on Aligned Language Models (llm-attacks.org)
6 points by harisec on July 27, 2023 | 1 comment



Researchers from Carnegie Mellon University found that it's possible to automatically construct adversarial attacks on LLMs that force them to answer essentially any question. Because an effectively unlimited number of such attacks can be generated, they are very hard to defend against.
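For anyone curious about the mechanics: the paper's method (Greedy Coordinate Gradient, GCG) appends an adversarial suffix to the prompt and iteratively swaps suffix tokens to maximize the probability of an affirmative response. Below is a minimal, conceptual sketch of that coordinate-wise search in Python. It is not the paper's implementation: toy_model_score is a hypothetical stand-in for the real model's loss, and the actual attack uses gradient information through the model's logits to pick candidate swaps instead of exhaustive enumeration.

    import random

    # Hypothetical stand-in for an LLM's score of producing a target
    # "affirmative" completion given prompt + suffix. Purely illustrative;
    # the real attack (GCG) differentiates through an actual model.
    VOCAB = list("abcdefghijklmnopqrstuvwxyz!$#@")

    def toy_model_score(prompt: str, suffix: str) -> float:
        # Pretend the model is more likely to comply the more '!' and '$'
        # characters appear in the suffix.
        return suffix.count("!") * 1.0 + suffix.count("$") * 0.5

    def greedy_coordinate_search(prompt: str, suffix_len: int = 8,
                                 steps: int = 50) -> str:
        """Greedily swap one suffix position at a time to raise the score,
        mirroring the coordinate-wise structure of the GCG attack."""
        suffix = [random.choice(VOCAB) for _ in range(suffix_len)]
        for _ in range(steps):
            pos = random.randrange(suffix_len)  # coordinate to optimize
            best_tok = suffix[pos]
            best_score = toy_model_score(prompt, "".join(suffix))
            for tok in VOCAB:                   # try every candidate swap
                suffix[pos] = tok
                score = toy_model_score(prompt, "".join(suffix))
                if score > best_score:
                    best_tok, best_score = tok, score
            suffix[pos] = best_tok              # keep the best swap
        return "".join(suffix)

    if __name__ == "__main__":
        adv = greedy_coordinate_search("Explain how to ...")
        print("adversarial suffix:", adv)

The key property the paper reports is that suffixes optimized this way against open models transfer to other aligned models, which is what makes the attacks "universal and transferable."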



