> But if the passphrase is already strong (6 random words from the Diceware word...

segfaultbuserr · on Oct 9, 2019

MD5 is vulnerable to collision attacks, which allows the attacker to control both messages, m and m', and find a case where h(m) == h(m').

But if a hash, h(m), is given, finding m' where h(m) == h(m') is much more difficult, it's known as a second-preimage attack. "Image" basically means "output", "preimage" means "input", "second-preimage attack" means "find another input that has the same output already given here".

Wikipedia says a preimage attack against full MD5 still requires 2^123.4 steps (2009), only a theoretical possibility. Second-preimage should be much harder.

I don't know if there are improvements, but it's still extremely difficult. Well, of course it's not to say that you should use MD5.

comex · on Oct 9, 2019

A second-preimage attack is where you want to find m' where h(m) == h(m')... and you know m already. This is not very useful for password hashing; it would give you a second password that would also work to log into the account, but what's the point of that if you already know the first password? The relevant attack for password hashing is a regular preimage attack, where you don't know m (and it would be acceptable to find either m itself or any other string that hashes to the same value).

comradesmith · on Oct 10, 2019

You don't need to know m, just h(m) which is commonly found in database breaches

taejo · on Oct 10, 2019

That's just a "pre-image attack". A "second pre-image attack" is a different scenario, not relevant to password-hashing for the reasons grandparent described, where you already know a pre-image, and must find a different one.

anchpop · on Oct 9, 2019

It doesn't seem like it should be obviously true to me. If the hash algorithm was rot13 it would be pretty easy to determine the password from the hash regardless of the strength of the password

fwip · on Oct 9, 2019

Yep, you need both the input and hash to be strong.

A weak hash reveals information about its input, narrowing the search space. In the example case of md5 or rot13, you can use this to compute collisions for a given hash.

Also, a hash that is lightning-quick to compute is faster to brute force. That's why bcrypt has a tunable "cost" factor - to make the hashing take longer and make guessing the password slower.

segfaultbuserr · on Oct 9, 2019

I used ambiguous language, "strong hash".

I should've used "strong KDF" rather than "strong hash", a hash can be strong for other purposes, but makes a poor KDF for hashing passwords, such as single-round SHA-256.

In the ideal world, if your password is a random word with 128-bit entropy, no strong KDF is needed, there's no need for PBKDF2, bcrypt, or Argon2, a single round of SHA-256 is sufficient.

> In the example case of md5 or rot13

MD5 still has strong preimage/second-preimage resistance, unlike ROT-13.

But nobody uses random 128-bit strings as passwords, here's how key stretching and cost-factor comes to play.

schoen · on Oct 9, 2019

You could argue that ROT13 accidentally has second-preimage resistance because given m, you won't be able to find n≠m where ROT13(n)=ROT13(m). :-)

nicoburns · on Oct 9, 2019

Some quick (and uninformed) mental maths makes this ~22 random alphanumeric characters:

26 (a-z) + 26 (A-Z) + 10 (0-9) = 62 characters This which can be represented with (just under) 6 bits of information. (2^6 = 64). And 128/6 < 132/6 = 22.

I'd guess quite a few people who use password managers use password this length...

hultner · on Oct 9, 2019

Can ROT-13 really be called a hash though? It's literally an ancient chipher.

ben509 · on Oct 9, 2019

By the plain* meaning of "hash", it can't, it's a symmetric cipher.

* Where "plain" excludes a technical or mathematical definition that might include e.g. troll_hash(x) { return 9; }

ncmncm · on Oct 9, 2019

All ciphers are also hashes.

Using chaining, encipherment of the last block is also a hash of the whole input.

Secure hashes are optimized for different characteristics than typical ciphers, but with enough headroom and time each can fill in for the other.

Of course some are not very good, for either use.

comradesmith · on Oct 10, 2019

Rot13 is not a hashing algorithm. A hashing algorithm is a one-way function where many entities in the input domain map to the same entity in the output codomain. This means if you have the hash you can't determine the input with out making a guess.

Rot13 is a function with a one to one mapping between the domain and codomain. If you have the output you can apply a function to get the input.

drkstr · on Oct 10, 2019

Not sure why the downvotes. Comradesmith's assessment of rot13 is absolutely correct. Clearly rot13 is more like PGP, in that you can recover the plain text from cypher text.

myle · on Oct 10, 2019

But you don't care about finding the original password. You only care about finding a string that after applying the hash function, gives the same out.

That's why you can have a hash function like h(x) = 0, whose value gives you no information about x, and still not being able to use it.