Universal and Transferable Attacks on Aligned Language Models (llm-attacks.org)
3 points by fgfm on July 28, 2023 | 1 comment



A study of adversarial attacks on LLMs that steer aligned models toward misaligned objectives.
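The linked work searches for an adversarial suffix that, when appended to a user prompt, pushes an aligned model toward a chosen target completion. Below is a minimal conceptual sketch of a gradient-guided suffix search in that spirit (a HotFlip-style first-order substitution score with a greedy accept step); the model name "gpt2", the prompt, the target string, and the single-candidate update are illustrative assumptions, not the paper's implementation or hyperparameters.

  # Conceptual sketch only: optimize a short suffix so the model assigns
  # high probability to a fixed target continuation. All names are assumptions.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")            # assumption: any small causal LM
  model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
  for p in model.parameters():
      p.requires_grad_(False)                            # only the suffix is optimized
  emb_matrix = model.get_input_embeddings().weight       # (vocab_size, hidden_dim)

  prompt_ids = tok.encode("Explain how to do X.", return_tensors="pt")[0]
  target_ids = tok.encode(" Sure, here is how", return_tensors="pt")[0]
  suffix_ids = tok.encode(" ! ! ! ! !", return_tensors="pt")[0].clone()

  def target_loss(suffix_embeds):
      """Cross-entropy of the target continuation given prompt + suffix."""
      embeds = torch.cat([emb_matrix[prompt_ids],
                          suffix_embeds,
                          emb_matrix[target_ids]]).unsqueeze(0)
      logits = model(inputs_embeds=embeds).logits[0]
      start = len(prompt_ids) + len(suffix_embeds) - 1   # logits that predict the target
      pred = logits[start:start + len(target_ids)]
      return torch.nn.functional.cross_entropy(pred, target_ids)

  for step in range(20):                                 # a few greedy rounds
      suffix_embeds = emb_matrix[suffix_ids].clone().requires_grad_(True)
      loss = target_loss(suffix_embeds)
      loss.backward()
      # First-order score of swapping each suffix position to each vocab token;
      # try the best token at one random position, keep it only if the true loss drops.
      scores = -suffix_embeds.grad @ emb_matrix.T        # (suffix_len, vocab_size)
      pos = torch.randint(len(suffix_ids), (1,)).item()
      cand = scores[pos].argmax().item()
      new_ids = suffix_ids.clone()
      new_ids[pos] = cand
      with torch.no_grad():
          if target_loss(emb_matrix[new_ids]) < loss:
              suffix_ids = new_ids

  print(tok.decode(suffix_ids))                          # candidate adversarial suffix

The paper's actual search is batched over many candidate positions and tokens and optimized against multiple prompts and models at once, which is what makes the resulting suffixes universal and transferable; the sketch above only illustrates the single-suffix, single-model idea.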



