Hacker News
Universal and Transferable Attacks on Aligned Language Models (llm-attacks.org)
6 points by harisec on July 27, 2023 | 1 comment



Researchers from Carnegie Mellon University found that it's possible to automatically construct adversarial attacks on LLMs that force them to answer essentially any question. Because an effectively unlimited number of such attacks can be generated, they are very hard to defend against.
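For anyone curious about the mechanics: the paper's method (Greedy Coordinate Gradient, GCG) appends an adversarial suffix to the prompt and iteratively swaps suffix tokens to maximize the probability of an affirmative response. Below is a minimal, conceptual sketch of that coordinate-wise search in Python. It is not the paper's implementation: toy_model_score is a hypothetical stand-in for the real model's loss, and the actual attack uses gradient information through the model's logits to pick candidate swaps instead of exhaustive enumeration.

    import random

    # Hypothetical stand-in for an LLM's score of producing a target
    # "affirmative" completion given prompt + suffix. Purely illustrative;
    # the real attack (GCG) differentiates through an actual model.
    VOCAB = list("abcdefghijklmnopqrstuvwxyz!$#@")

    def toy_model_score(prompt: str, suffix: str) -> float:
        # Pretend the model is more likely to comply the more '!' and '$'
        # characters appear in the suffix.
        return suffix.count("!") * 1.0 + suffix.count("$") * 0.5

    def greedy_coordinate_search(prompt: str, suffix_len: int = 8,
                                 steps: int = 50) -> str:
        """Greedily swap one suffix position at a time to raise the score,
        mirroring the coordinate-wise structure of the GCG attack."""
        suffix = [random.choice(VOCAB) for _ in range(suffix_len)]
        for _ in range(steps):
            pos = random.randrange(suffix_len)  # coordinate to optimize
            best_tok = suffix[pos]
            best_score = toy_model_score(prompt, "".join(suffix))
            for tok in VOCAB:                   # try every candidate swap
                suffix[pos] = tok
                score = toy_model_score(prompt, "".join(suffix))
                if score > best_score:
                    best_tok, best_score = tok, score
            suffix[pos] = best_tok              # keep the best swap
        return "".join(suffix)

    if __name__ == "__main__":
        adv = greedy_coordinate_search("Explain how to ...")
        print("adversarial suffix:", adv)

The key property the paper reports is that suffixes optimized this way against open models transfer to other aligned models, which is what makes the attacks "universal and transferable."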



