Hacker News

A new jailbreaking method with this level of effectiveness against these models that can produce the entirety of those unsafe outputs?

Yes.

May I see it?

No.



Seymour! The house is on fire!


You will see it soon. We thought it might be harmful to publish it before it is patched, especially because you can bypass basically all the safeguards with it.


Sounds like it won’t be verifiable or reproducible.



