Hacker News

All modern computer security is based on improbabilities. Public-key cryptography, hashing, tokens, etc. all rely on secrets being extremely improbable to guess, but not impossible. If an LLM can eventually reach that threshold, it will be good enough.


Reaching that threshold would require an improvement of more than 30 orders of magnitude, even if you assume the current probability of an LLM violating alignment is 1 in 100,000,000. The actual probability is much, much higher than that, but let's cut the LLMs some slack and pretend. Improving by a factor of 10^30 is extremely unlikely.
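
For a rough sense of where "more than 30 orders of magnitude" comes from, here is a back-of-the-envelope sketch. It assumes the cryptographic threshold means 128-bit security, i.e. a 2^-128 chance of a single guess succeeding; that figure is my assumption and not something stated above, and the constant and function names are purely illustrative.

```python
import math

# Assumption (not stated in the thread): "the threshold" is 128-bit security,
# meaning one guess succeeds with probability 2**-128.
CRYPTO_GUESS_PROBABILITY = 2.0 ** -128

# Figure from the comment: a generous 1-in-100,000,000 alignment failure rate.
LLM_FAILURE_PROBABILITY = 1e-8


def orders_of_magnitude_gap(current: float, target: float) -> float:
    """Return how many powers of ten separate `current` from `target`."""
    return math.log10(current / target)


gap = orders_of_magnitude_gap(LLM_FAILURE_PROBABILITY, CRYPTO_GUESS_PROBABILITY)
print(f"{gap:.1f} orders of magnitude")  # prints "30.5 orders of magnitude"
```

A different assumed security level shifts the number (256-bit would roughly double the exponent), but the gap stays enormous either way.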


Cryptography's risk profile is modeled against active adversaries. The way probability is being thrown around here is not like that. If 1 in a billion inputs across the full training set triggers this behavior, that's not the same as a 1-in-a-billion chance against an active adversary, who will go looking for the triggering inputs instead of sampling at random. And even in cryptography there are vulnerabilities other than brute force.



