For sure, this is possible. But Troy Hunt is well known, and with his new v2 API only the first 5 characters of the SHA-1 hash are sent across.
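For anyone who wants to see what actually leaves the machine, here is a minimal sketch of a range lookup. It assumes the public range endpoint at https://api.pwnedpasswords.com/range/ and its `SUFFIX:COUNT` line format; the function names, the `fetch` parameter and the User-Agent string are made up for illustration.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"  # assumed public range endpoint


def split_hash(password):
    """Return (first 5 hex chars, remaining 35 hex chars) of the SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def fetch_range(prefix):
    """Fetch every known suffix for a prefix; only the prefix is sent."""
    req = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "range-lookup-sketch"},  # hypothetical UA string
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def pwned_count(password, fetch=fetch_range):
    """Compare suffixes locally; the full hash never leaves this function."""
    prefix, suffix = split_hash(password)
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0
```

Calling `pwned_count("password")` would make one request carrying only the 5-character prefix; passing your own `fetch` lets you try the logic offline.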
Alternatively, like I did, you can download the 31GB file and do it all locally, and not involve network round trips.
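A rough sketch of the local check, assuming the downloaded file is plain text with one `HASH:COUNT` entry per line (uppercase hex SHA-1). The function name is made up, and a plain linear scan is used here for simplicity rather than anything clever.

```python
import hashlib


def local_pwned_count(password, lines):
    """Scan an iterable of 'HASH:COUNT' lines for the password's SHA-1.

    `lines` can be an open file object, so the file is streamed and never
    loaded into memory in one go.
    """
    target = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    for line in lines:
        digest, _, count = line.strip().partition(":")
        if digest.upper() == target:
            # Assumption: a line without a count still means "seen at least once".
            return int(count) if count else 1
    return 0
```

Usage would look like `with open(path) as fh: local_pwned_count("hunter2", fh)`, where `path` points at wherever you unpacked the download. Nothing touches the network.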
1p is well known in the community for being pretty good with security, and implementing this differently from how they say they did would severely damage their reputation. Once it ships in the local 1p apps, we can verify that they did it the way they said they did.
You're only sending part of the hash here, and the potentially evil password-checking service wouldn't know who's sending the request or where the password is used. All it would know is the originating IP address. I don't know of a way to link a particular IP to a username on some online service; if one existed, it would be major news.
Presumably in that case 1Password is sending the requests from their own servers, so the whole range of roughly 1 million prefixes (16^5 = 1,048,576) would be quickly explored, giving no information at all to the password-checking service.
I agree completely. You're sending a prefix which is good enough to reject 99.9999% of candidate passwords. Maybe you're fine sending that prefix to Troy Hunt or to 1Password because you trust they won't try to use it against you.
I'm not passing judgement on whether they are trustworthy. But I do think all the back-patting around "k-anonymity" should have mentioned exactly how useful, or not, the prefix would be in the hands of an attacker.
The argument that "the attacker won't know which user the prefix belongs to" is certainly a big qualifier that would make access to the prefixes less useful. Can we be certain that always holds true?
Certainly in the case of 1Password the user is sending their hash prefixes in a way that directly identifies them to 1Password, and to anyone who can snoop on that communication. This is also true for anyone using the HIBP webform directly to submit their favorite passwords.
sqlite> create table hashcount as select substr(hash, 1, 5) as hashbeg, count(*) as count from pwned group by hashbeg;
sqlite> select avg(count) from hashcount;
478.397716142925
sqlite> select min(count) from hashcount;
381
So I'm not sure I agree with you, given that on average almost 500 records come back for the first 5 characters of the SHA-1 hash across the half a billion records.
I do agree that it helps with cracking the passwords. But all of these passwords are BAD and are known to have leaked, so cracking them again isn't the use case; the bad people likely already have all of them in plain text.
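The average from the sqlite query is about what you'd expect from simple division. A quick back-of-the-envelope check, assuming the hashes are spread evenly over the prefixes and rounding the data set to the "half a billion" mentioned above (the variable names are just for illustration):

```python
PREFIX_HEX_CHARS = 5

# Each hex character carries 4 bits, so 5 characters give 16**5 prefixes.
prefixes = 16 ** PREFIX_HEX_CHARS

# Assumption: "half a billion" records, rounded; not the exact row count.
approx_records = 500_000_000

avg_per_prefix = approx_records / prefixes

print(prefixes)               # 1048576
print(round(avg_per_prefix))  # 477
```

That lands right next to the measured average of about 478 per prefix.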
The resolution of the discriminator has nothing to do with the strength of your password.
Having the first 20 bits of the hash certainly isn’t the same as having the whole thing, but the simple math says that only 1 in a million wrong guesses will match.
This is very powerful if, for example, you want to crack a strong (slow) hash and you also have the first 5 characters of the SHA-1, because then you only have to run the slow hash on about 1 in a million guesses.
It’s also very powerful if you want to do an online attack because you can narrow down your guesses quite significantly.
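To make the offline case concrete, here is a small sketch of using a leaked SHA-1 prefix as a cheap pre-filter in front of a slow hash. PBKDF2 is only a stand-in for "a strong hash", and the function names, salt and iteration count are all made up for illustration.

```python
import hashlib


def sha1_prefix(candidate, n=5):
    """First n hex characters of the candidate's SHA-1 (the leaked part)."""
    return hashlib.sha1(candidate.encode("utf-8")).hexdigest().upper()[:n]


def slow_hash(candidate, salt):
    """Stand-in for the strong hash; the iteration count is arbitrary."""
    return hashlib.pbkdf2_hmac("sha256", candidate.encode("utf-8"), salt, 100_000)


def crack(target_slow, salt, leaked_prefix, guesses):
    """Return (matching guess or None, number of slow-hash calls made)."""
    slow_calls = 0
    for guess in guesses:
        # Cheap check first: nearly every wrong guess is rejected here.
        if sha1_prefix(guess) != leaked_prefix:
            continue
        slow_calls += 1
        if slow_hash(guess, salt) == target_slow:
            return guess, slow_calls
    return None, slow_calls
```

With a long guess list, `slow_calls` stays tiny compared to the number of guesses, which is the whole point: the expensive hash only runs on the few candidates that survive the prefix check.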
Lastly, if you are logging in with keys (random entropy encoded as a human-readable string) rather than passwords, then of course none of this concerns you in the slightest, nor do you have any use for this API.
To state it another way, if you have 80 bits of entropy in your password, it may be no big deal to throw away 20 bits' worth (but for what purpose?). However, the average password has less than 30 bits of entropy, so throwing away 20 bits is a big deal.
The end result is that a lot of trust is being placed in this API. What is particularly concerning is the idea that services should be calling out to it as part of a login process, or that we should be training average users to test their passwords in a webform.
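Putting numbers on that, as a rough sketch only: it assumes the password is picked uniformly from a space of 2^bits candidates and that the 20 leaked hash bits cut that space down evenly. The function name is made up.

```python
def remaining_guesses(entropy_bits, leaked_bits=20):
    """Candidates left to try once leaked_bits of the hash are known."""
    remaining_bits = max(entropy_bits - leaked_bits, 0)
    return 2 ** remaining_bits


for bits in (30, 80):
    print(bits, 2 ** bits, remaining_guesses(bits))
```

For a 30-bit password that leaves on the order of a thousand candidates (2^10), while an 80-bit one still has 2^60 left.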
I'm only having fun but yes, I am sure it's a solid source. I have in fact inputted one of my weaker (common, low-impact forum-only) passwords there in the past. It was pwned.