Breaking 7 / 8 of the ICLR 2018 adversarial example defenses (github.com/anishathalye)
2 points by anishathalye on Feb 1, 2018 | 1 comment


Hi HN! I’m one of the researchers who worked on this. Both Nicholas Carlini (a co-author on this paper) and I have done a bunch of work on machine learning security (specifically, adversarial examples), and we’re happy to answer any questions here!

Adversarial examples can be thought of as “fooling examples” for machine learning models. For example, for an image classifier, given an image x that is classified correctly, an adversarial example is an image x* that is visually similar to x but is classified incorrectly.
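
To make that concrete, here is a minimal sketch (written in PyTorch purely for illustration; it is not the code from our repo, and the model/parameter names are made up) of the classic fast gradient sign method from Goodfellow et al., one of the simplest ways to construct such an x*:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eps=0.03):
        # Construct x* within an L-infinity ball of radius eps around x.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), label)
        loss.backward()
        # Step in the direction that increases the loss, then clip to [0, 1].
        return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

The perturbation is small enough that x* still looks like x to a human, but it is chosen specifically to push the model's loss up, which is usually enough to flip the predicted label.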

We evaluated the security of the 8 defenses accepted at ICLR 2018 (one of the top machine learning conferences) and found that 7 of them are broken. Our attacks succeeded where others failed because we show how to work around defenses that cause gradient-descent-based attack algorithms to fail.
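
As a rough illustration of what "working around" such a defense can look like (this is a sketch, not the exact code from our repository): if a defense prepends a non-differentiable input transformation g to the classifier, you can run the real transformation on the forward pass but substitute a differentiable approximation of it (even just the identity) on the backward pass, so a standard gradient-based attack still gets a useful gradient:

    import torch

    class ApproxGradTransform(torch.autograd.Function):
        # Forward pass: apply the defense's real (non-differentiable) transform.
        # Backward pass: pretend the transform was the identity, so gradients
        # still flow back to the input pixels.
        @staticmethod
        def forward(ctx, x, transform):
            return transform(x)

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output, None

    def attack_step(model, transform, x, label, step_size=0.01):
        # One gradient step against the defended model model(transform(x)).
        x_adv = x.clone().detach().requires_grad_(True)
        logits = model(ApproxGradTransform.apply(x_adv, transform))
        torch.nn.functional.cross_entropy(logits, label).backward()
        return (x_adv + step_size * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

Iterating a step like this in a standard attack loop is often enough to defeat defenses whose main effect is to break the gradient signal rather than to actually remove adversarial examples.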
