SHA-3 Standard [pdf] (nist.gov)
98 points by dchest on Aug 5, 2015

Whoa, finally! I've been oddly attached to Keccak's story over the years ever since it won the contest. Glad to see SHA-3 finished.

Did NIST make any changes to the algorithm since? From what I can tell it looks like they only made clarifications.

To celebrate, here's a hacked-up keccak-512 implementation in Rust: https://github.com/pshc/keccak (I have yet to check whether it's actually SHA-3.)

I'm not clear either on whether anything substantial has changed since the April 2014 draft. Would be good to have some authoritative hash values to test libraries against.
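One rough way to test a library is to compare it against the widely published empty-message digests for the final standard. A minimal sketch in Python, assuming a `hashlib` build with SHA-3 support (Python 3.6 or later). The names `EMPTY_VECTORS`, `check_sha3` and `hashlib_digest` are made up for illustration, and NIST's own example values remain the authoritative source:

```python
import hashlib

# Widely published digests of the empty message for the final SHA-3 functions.
# Pre-standard Keccak implementations use different padding, so they are
# expected to produce different values.
EMPTY_VECTORS = {
    "sha3_256": "a7ffc6f8bf1ed76651c14756a061d662"
                "f580ff4de43b49fa82d80a4b80f8434a",
    "sha3_512": "a69f73cca23a9ac5c8b567dc185a756e97c982164fe25859e0d1dcc1475c80a6"
                "15b2123af1f5f94c11e3e9402c3ac558f500199d95b6d3e301758586281dcd26",
}

def check_sha3(digest_hex):
    """digest_hex(name, data) returns the hex digest from the library under test."""
    return {name: digest_hex(name, b"") == expected
            for name, expected in EMPTY_VECTORS.items()}

def hashlib_digest(name, data):
    # Adapter for Python's own hashlib; swap in another library here.
    return hashlib.new(name, data).hexdigest()

print(check_sha3(hashlib_digest))
```

A library that only implements the original Keccak submission would likely fail this check, which is one quick way to tell the two apart.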

Glad to see it sets c = 2d at least.

This was the result of the contest in 2012, commented by Bruce Schneier here: https://www.schneier.com/cgi-bin/mt/mt-search.cgi?search=SHA...

Released today.

Here are responses to draft comments: https://www.federalregister.gov/articles/2015/08/05/2015-191...

Anyone have a VHDL or Verilog implementation of this? :)

For that matter, I haven't looked at the algorithm. Is it readily implementable in hardware?

There is a VHDL implementation of Keccak on their homepage: http://keccak.noekeon.org/files.html

We have a SHA-3 core by Bernd Paysan written in Verilog in the Cryptech repo. http://wiki.cryptech.is/browser/core/hash/sha3

"The KECCAK-p permutations were designed to be suitable as the main components for a variety of cryptographic functions, including keyed functions for authentication and/or encryption. The six SHA-3 functions can be considered as modes of operation (modes) of the KECCAK-p[1600,24] permutation. In the future, additional modes of this permutation or other KECCAK-p permutations may be specified and approved in FIPS publications or in NIST Special Publications. "

Aww. Is there any news on a standard (NIST, or just informal RFC) on an authenticated encryption mode for KECCAK? I remember seeing a presentation of KECCAK at passwords[1] -- and I thought that part looked the most fun ;-)

[1] http://passwords12.at.ifi.uio.no/

I don't know about NIST standardization, but you can see what the Keccak team is planning with the Keyak and Ketje CAESAR candidates:



XOF, nice.. but what is the point of expanding SHAKE256 to e.g. 4096 bits, if its security remains 256 bits ?

To derive 4096 bits of pseudorandom data from an input. For example, you could derive 16 256-bit keys from one master secret.
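As a rough sketch of that use, assuming Python's `hashlib.shake_256`: the helper name `derive_keys` and the example secret are hypothetical, and a real design would probably mix in a context string for domain separation.

```python
import hashlib

def derive_keys(master_secret: bytes, count: int = 16, key_len: int = 32):
    """Squeeze count * key_len bytes from SHAKE256 and split them into keys."""
    stream = hashlib.shake_256(master_secret).digest(count * key_len)
    return [stream[i * key_len:(i + 1) * key_len] for i in range(count)]

keys = derive_keys(b"example master secret")
print(len(keys), "keys of", len(keys[0]) * 8, "bits each")
```

Each key is a slice of the same 4096-bit output stream, so the security of all sixteen still rests on the one master secret.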

If I were the NSA, how would I backdoor this?

I would tell people new crypto can't be trusted, and they should stick with proven standards like MD5.

That's exactly what the NSA would say to get us to use a new, backdoored hash function! /s

This is obviously not what they are saying.

For non-scientists, what does this mean? Is SHA-2 not good anymore? What should I do?

SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

My current favorite conservative hash choice is SHA512/256, which is the SHA-2 that generates a 512-bit output but truncates it to 256. It gives you the same length extension protection that is the most important feature of SHA-3, and is available in most libraries already.

I have never recommended to anyone that they switch from SHA-2 to SHA-3. I'm actually in "wait and see" mode about SHA-3; there are compelling other hashes available if you want to be ultra-modern about which hash you use.

> SHA-2 is fine, and in fact the more conservative choice right now. SHA-3 didn't happen because SHA-2 was threatened.

To extend on that, shortly after SHA-1 fell, there was the very real threat that the SHA-2 family would follow suit (they are conceptually similar). This worry brought NIST to hold the SHA-3 competition. Fortunately, the SHA-1 attacks did not turn out to be transferable, so far, and consequently trust in SHA-2 has substantially increased since. Still, NIST (rightly) followed through with the initial idea of the contest and chose a hash function that was as different from SHA-2 as possible (Keccak).

Thus, we now have two very high-quality hash functions at our disposal. If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes). This construction is collision resistant if at least one of them remains collision resistant. (Pseudorandomness relies on the security of both hashes, though, and hashing the whole message twice comes at a hefty performance hit. Especially since SHA3-512 is veeery slow – blame it on the clueless tech media attacking NIST for tweaking Keccak, ignoring even the authors who supported NIST's decision.)
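For concreteness, the concatenation could be sketched like this in Python, assuming `hashlib` with SHA-3 support. The function name `concat_hash` is made up for illustration:

```python
import hashlib

def concat_hash(message: bytes) -> bytes:
    """Return SHA512(m) || SHA3-512(m), a 128-byte (1024-bit) digest."""
    return hashlib.sha512(message).digest() + hashlib.sha3_512(message).digest()

print(len(concat_hash(b"hello")) * 8, "bits")
```

The message is hashed twice, once per function, which is where the performance cost comes from.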

> This construction is collision resistant if at least one of them remains collision resistant

Please don't throw around well-defined terms. This isn't true.

What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

On the other hand, collision resistance is a comparison between 2^(hash_length/2) and the work factor required to find a collision. Concatenating the two outputs would only remain collision resistant if it caused an exponential increase in the work factor to find a collision.

Since the SHA-512 output is the whole hash state, once you've found a SHA-512 collision, you can keep appending to the two collided documents and they'll stay collided, so you can use this as a starting point for your SHA3-512 collision. So, even assuming no weaknesses, the work factor to find collisions in your 1024-bit concatenated construction is 2^256 + 2^256, not 2^512, and thus not collision resistant.

Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant. However, as proposed, it's not collision resistant, even if both underlying hash functions are collision resistant.

> Note that some hash functions output only half of their state vector as the final hash. If you built your construction out of two such hash functions, and no weaknesses were found in either, then your proposed construction would be collision resistant.

Actually, as long as the hash functions are iterative, the whole construction can never be significantly stronger than the best hash function, see [1].

> What you mean is that "the work factor for finding a collision in the concatenated pair is at least the max of finding a collision in either half of the concatenation." That's a true statement.

What I meant was "as long as it is infeasible in practice to find a collision in either of them, it is infeasible to find a collision in the concatenation". Comparing the security of hash functions to random oracles with the same output length only makes sense if the construction of the hash function supposedly affords this security.

Conversely, I find it absurd to call the hash function that outputs the first 64 bits of SHA-1 collision resistant, because it requires at least 2^32 steps to find a collision. It fits with the oracle definition, but gives you no information about its real world security.

If you want to make precise statements, you can add the work factor to your statement, e.g. "The first 512 output bits of SHAKE256 afford preimage resistance up to a work factor of 2^256".

[1] Antoine Joux. Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions. In Advances in Cryptology - CRYPTO 2004, volume 3152 of Lecture Notes in Computer Science, pages 306–316. Springer Berlin Heidelberg, 2004. http://citeseerx.ist.psu.edu/viewdoc/download?doi=

> If you need a really conservative choice, hash the message m as SHA512(m)||SHA3-512(m) (the concatenation of the individual hashes).

Although keep in mind that you'll leak information about the input if either hash leaks information about the input.

For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant... but you wouldn't want to use `badhash(pad(secret) ++ nonce)` as a precommitment scheme. All of the extra entropy in the nonce, which otherwise might have protected against brute force attacks on low-entropy secrets, is being given to the attacker via the crc.
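A small sketch of the leak, with CRC-32 standing in for `crc` and SHA-256 for `goodhash` (both are assumptions, as is the helper name `crc_filter`). It only shows that the CRC part of the output can be tested on its own at almost no cost. It does not implement any algebraic attack on the CRC:

```python
import hashlib
import zlib

def badhash(data: bytes) -> bytes:
    """4-byte CRC-32 followed by a SHA-256 digest of the same data."""
    crc = zlib.crc32(data).to_bytes(4, "big")
    return crc + hashlib.sha256(data).digest()

def crc_filter(commitment: bytes, candidates):
    """Keep only candidates whose CRC-32 matches the one exposed in the commitment."""
    target = int.from_bytes(commitment[:4], "big")
    return [c for c in candidates if zlib.crc32(c) == target]

commitment = badhash(b"secret")
print(crc_filter(commitment, [b"secret", b"secres"]))
```

Everything the CRC reveals about the input is handed to the attacker in the clear, regardless of how strong the second half of the output is.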

> For example, the hash function `badhash(blocks) = crc(blocks) ++ goodhash(blocks)` is collision resistant...

Actually, it isn't, for the usual definition of collision resistance compares the work factor to find a collision against 2^(hash_length/2). Extending a hash with crc32 lengthens the hash, but increases the bar for considering the hash collision-resistant. Concatenating the outputs of two collision-resistant hash functions doesn't even (generally) result in a collision-resistant construction under the normal definition of collision resistance.

EDIT: See my nearby post in this same thread for a longer explanation.

Concatenation of the hashes seems like an unjustified risk that in certain circumstances will allow weaknesses from either algorithm to flow through to the final hash. If you really want to combine the hashes, XOR seems like a safer bet to me (since the algorithms are unrelated there should be no potential cancellation of entropy).

Seems like XOR is better for approximating a random oracle, and appending is (negligibly) better for ensuring collision resistance. People often are not clear about which of these two very different properties they actually want out of a hash.
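The XOR variant would look roughly like this, again assuming Python's `hashlib` and a made-up function name `xor_hash`:

```python
import hashlib

def xor_hash(message: bytes) -> bytes:
    """XOR the SHA-512 and SHA3-512 digests into a single 64-byte value."""
    a = hashlib.sha512(message).digest()
    b = hashlib.sha3_512(message).digest()
    return bytes(x ^ y for x, y in zip(a, b))

print(xor_hash(b"hello").hex())
```

Unlike concatenation, the output stays at 512 bits and neither individual digest is visible in it.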

++ on SHA-512/256. SHA-512 uses 64-bit operations where SHA-256 uses 32-bit, so on a beefy 64-bit chip, it's faster per byte hashed. So, compared to SHA-256, more rounds, twice the state size, same familiar/widely-implemented design, and faster -- what's not to love?
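If you want to check the speed claim on your own machine, a rough timing sketch in Python follows. The function name `throughput_mb_s` and the buffer size are arbitrary choices, it times plain SHA-512 rather than SHA-512/256 (which uses different initial values but the same compression function), and the numbers depend heavily on the CPU and on how `hashlib` was built:

```python
import hashlib
import time

def throughput_mb_s(name: str, size_mb: int = 8) -> float:
    """Hash size_mb megabytes of zeros once and return the rate in MB/s."""
    data = b"\x00" * (size_mb * 1024 * 1024)
    start = time.perf_counter()
    hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return size_mb / elapsed

for name in ("sha256", "sha512"):
    print(name, round(throughput_mb_s(name), 1), "MB/s")
```

On CPUs with dedicated SHA-256 instructions the ordering may well flip, so treat a single run as a hint rather than a benchmark.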

See http://bench.cr.yp.to/results-hash.html for a comparison of hash-function speeds.

Thanks for the link. I was just about to ask the very question you answered. :)

Dang, just saw your reply which was better than mine.

Unfortunately, the SHA cpu extensions that will soon be available in Skylake Xeon parts (and the crypto extensions in ARMv8-a) only support sha2-256 (and SHA256/224, and the ill-advised sha1... why is Intel adding instructions or microcode for a hash function that's being phased out?). So you have a choice between a faster-in-software-on-64bit sha-512/256 and a faster-in-hardware sha-256. Unless there's some way to get partial speed-up of sha-512 using sha-256 instructions, but at a glance they don't look low-level enough to apply to sha-512... do they?

Or you can ignore both sha256 and sha512/256 and use something else like sha-3 or blake2b. Blake2b obviously has had less attention on it, so it's more likely to harbor a weakness, but it's fast in software. And sha-3 will get cpu extensions eventually, and it'll hopefully be a better thought-out implementation than just support for the 256-bit variant.

In that case HMAC-SHA-256 may be a good choice. It too is immune to length extension attacks, and the HMAC construct has proven itself to greatly augment the strength of the underlying hashing algorithm (e.g. MD5 is considered broken, but HMAC-MD5 is not). It's just twice as expensive as SHA-256, so I'm not sure if that's faster than SHA-512 on software versus HMAC-SHA-256 on hardware.
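A minimal HMAC-SHA-256 sketch using Python's standard `hmac` module. The helper names `make_tag` and `verify_tag` and the example key are hypothetical:

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"example key, not for real use"
tag = make_tag(key, b"amount=100")
print(verify_tag(key, b"amount=100", tag), verify_tag(key, b"amount=1000", tag))
```

The contrast is with the naive `sha256(key + message)` construction, which is the one exposed to length extension.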

512/224 is fine too, and also isn't length extendable.

Skein was built (by Bruce Schneier's team) from the ground up to take advantage of multi-core systems and run faster through concurrency.

I'm still quite a fan of it and was sad that it lost. Both are excellent algorithms however.

whaaaaaaat - taking a 512-bit hash and truncating it to 256 preserves everything you need about it! That's crazy (surprising). Naively - if you hadn't just told me otherwise - I'd think it's up there with my brilliant new algorithm: in a loop pad your input with 0x00 through 0xFF, take the SHA512 hash of each result, but only use the first bit! You now have a sooper secure 256 bit hash. I call it SHA512/1/1/.../1 (I'd write it all out here, but it looks obnoxious and might break someone's window width.)

Also, on 64 bit machines SHA-512/256 is generally faster than SHA-256.

I'd love to know what these other compelling hashes are.

Somebody else in this thread was talking about BLAKE2, which I cast a cursory glance at. It seems pretty cool, claims to evade the length-extension 'issues' that SHA-1 has.

Wikipedia indicates that there has been at least some progress as far as cryptanalysis goes, but even with that being said, there's always that lingering 'but what if' about anything NSA-related.

SHA-2 is also length-extendable, which means you have to be careful when you use it to build a MAC. (That's why I like the truncated version).

No cryptographer I know takes these particular "what-if's" seriously. They appear to come exclusively from non-cryptographers reacting to anything that NIST touched.

Valerie Aurora has a nice post showing the lifetime of cryptographic hash functions. You can see them all following the same pattern over time. You don't want to be on the bleeding edge, but also should avoid the trailing edge. http://valerieaurora.org/hash.html

IMO, the outstanding result of the SHA3 competition was not Keccak but BLAKE. BLAKE2 is the best hash function around: https://blake2.net

Why, what's better about BLAKE?

It's faster (in software).

This is a disturbing answer. You come out strongly in favor of BLAKE, but without any consideration of its security properties, just "it's faster".

BLAKE uses the same basic construction as MD5 and SHA1, neither of which is a responsible choice for a collision-resistant hash function any more. While attacks producing collisions have not been presented, this construction means it's susceptible to some other attacks. There are well-understood ways to avoid these attacks, but a naive usage of BLAKE2 in a production system will likely leave you open to length extension attacks in particular.

Keccak, on the other hand, uses the novel sponge construction, which injects content into the hash (absorb phase) and then iterates a threshing function (squeeze phase). This construction specifically addresses security concerns which BLAKE2 simply doesn't.

To be clear: I do think the BLAKE developers made a good contribution to the security community: particularly, using ChaCha makes their algorithm very fast, and I suspect that the next generation of fast collision-resistant hash functions will use ChaCha in the sponge construction. There are situations where BLAKE is a better choice than SHA3. But its use requires a great deal of knowledge and care to be secure, and for the average person implementing a secure system, SHA3 is a much more responsible choice.

Security should not be taken lightly. Bad security can expose people's private information and get people jailed, doxxed, and sometimes even killed. Glibly claiming BLAKE is better without any discussion of the security properties of the algorithms is completely irresponsible.

Can you please explain where exactly you got the idea that BLAKE2 was length-extendable?

Can you also please explain some of these other attacks you're talking about it sharing with MD5 and SHA1? The commonality between MD5, SHA1, and SHA2 is the Merkle Damgard structure. BLAKE2 isn't an MD hash. Are these MD attacks that you're asserting apply to BLAKE2?

I'd like to know where the certitude you're projecting is coming from.

> Can you please explain where exactly you got the idea that BLAKE2 was length-extendable?

No, you're right: I misunderstood the algorithm; an extension attack has not yet been found. My core point still stands though: choosing a cryptographic anything should start with a consideration of the security properties of the algorithm, and only then should we talk about speed.

BLAKE(2) is not open to length-extension attacks---in fact resistance to those was a requirement of any SHA-3 submission. Its design does not have much in common with MD5 and SHA-1 beyond the usage of the compression function building block, instead of a public permutation as sponges do. The mode of operation of BLAKE is not Merkle-Damgard, but a variant of HAIFA.

As far as security goes, Keccak and BLAKE are mostly in equal standing both in security margin (number of rounds attacked vs total number of rounds) and cryptanalytic attention received.

> BLAKE(2) is not open to length-extension attacks---in fact resistance to those was a requirement of any SHA-3 submission.

I did make a mistake understanding the algorithm. You're mostly right: no length extension attack has yet been found. However, HAIFA is far more similar to MD than the sponge construction, and this is widely cited as a reason for Keccak's selection.

> As far as security goes, Keccak and BLAKE are mostly in equal standing both in security margin (number of rounds attacked vs total number of rounds) and cryptanalytic attention received.

Then why was Keccak selected? It's clear that at least some analysts think Keccak has significant advantages over BLAKE.

And ultimately my point still stands: a comparison of cryptographic hashes should start with a discussion of their security properties. Glibly stating "it's faster therefore it's better!" is highly dangerous.

You're mixing modes of operation with compression functions. A mode can be shown to be unconditionally resistant to length-extension attacks. Every SHA-3 finalist's mode was shown to be 'perfect' (indifferentiable) as long as the compression function remains strong. So if a length-extension attack is found on hash function H, this means that something has gotten horribly wrong and it is likely that more serious attacks are also possible.

It is in the compression function (resp. permutation) that Keccak differs the most from BLAKE. BLAKE is ARX-based---like SHA-2---while Keccak only uses bitwise operations. Since SHA-2 is not being deprecated by NIST, they figured that having a 'different' SHA-3 would hedge their bets against an attack against ARX primitives that could potentially break both SHA-2 and {BLAKE, Skein}. This is stated in [1, §3.4]. This has nothing to do with the MD structure.

Sure, when discussing these things security does come first. But all of these functions have been thoroughly vetted already, so the differentiators are elsewhere: sponges are flexible, BLAKE is faster in software, etc.

[1] http://nvlpubs.nist.gov/nistpubs/ir/2012/NIST.IR.7896.pdf

Supposedly. So far, I haven't found a software implementation that is faster than MD5. However, I have found plenty of Skein software implementations that come close to meeting the speed specification.

See http://ae7.st/p/5px. I'm using the reference code at https://github.com/BLAKE2/BLAKE2 for blake2, and https://jxself.org/git/?p=skeinsum.git for skein.

One of the key points of hashes in cryptography is to be computationally expensive rather than faster. Faster means more password attacks per second.

Granted faster is better for non-cryptographic purposes like data indexes, but even there I'd consider performance secondary to the hash size, etc

No -- You are confusing password hashing algorithms with general purpose hashing algorithms. Algorithms like bcrypt and scrypt have a tunable difficulty parameter, and are designed to be slow.

For general purpose hashing, you want to check if the fingerprint of these ten gigabytes of data is the same as the fingerprint of these other ten gigabytes, as quickly as possible. Or whether a file that you downloaded is the same as this other file. Or whether the data that you transferred has been tampered with or corrupted. Speed is important enough that this was one of the criteria in the hash algorithm selection process.
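As an illustration of the fingerprinting case, here is a sketch that hashes a stream in chunks so the data never has to fit in memory. It assumes Python's `hashlib` with SHA-3 support. The function name `fingerprint` and the 1 MiB chunk size are arbitrary choices:

```python
import hashlib
import io

def fingerprint(stream, chunk_size: int = 1 << 20) -> str:
    """Return the SHA3-256 hex digest of a binary stream, read chunk by chunk."""
    h = hashlib.sha3_256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

# In-memory streams stand in for two large files here.
a = io.BytesIO(b"x" * 3_000_000)
b = io.BytesIO(b"x" * 3_000_000)
print(fingerprint(a) == fingerprint(b))
```

With real files you would pass `open(path, "rb")` instead of the `BytesIO` objects. The cost is dominated by the hash's throughput, which is why speed matters here.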

The key feature of a general purpose hashing algorithm is resistance to preimage attacks. In other words, "If I want hash 0x123456, what should the input be?" needs to be a difficult question to answer.

Speed is key when you are getting a fingerprint of a large amount of data. Don't use a general purpose hash directly for hashing passwords: It's better than plain text, but it's inferior by a long shot to special purpose password hashes.

I'm not getting confused. SHA-3 can be used for cryptographic hashing. In the linked PDF:

"The SHA-3 family consists of four cryptographic hash functions, called SHA3-224, SHA3-256, SHA3-384, and SHA3-512, and two extendable-output functions (XOFs), called SHAKE128 and SHAKE256."

If BLAKE isn't intended for cryptography then it's not a direct competitor to SHA-3.

General cryptography -- for example, message validation -- does not need to be slow. In fact, slow message validation would cripple hash functions for cryptography, increasing CPU load and reducing throughput.

Password checking is an edge case. Special purpose password hashing functions with tunable difficulty should be used for those. Do not use general purpose hash functions: They are better than plain text, but they are designed to be fast, and this makes it easier to brute force them.

> General cryptography -- for example, message validation -- does not need to be slow. In fact, slow message validation would cripple hash functions for cryptography, increasing CPU load and reducing throughput.

There is definitely a trade-off between the two (performance on servers vs. the rate at which an attacker can crack passwords). But generally the advice is to go for the slowest you can afford. That is why KDFs have an iteration parameter, so passwords can be hardened as hardware gets faster.

No: This is why you have a split between KDFs and hashes. Where you use one, you would not want to use another.

Trying to make one that does both leads to something that sucks at both.

> No: This is why you have a split between KDFs and hashes. Where you use one, you would not want to use another.

That's not true either. Hashes are recommended to be used as input to KDFs.

"Modern password-based key derivation functions, such as PBKDF2 (specified in RFC 2898), use a cryptographic hash, such as SHA-2"

Source: https://en.wikipedia.org/wiki/Key_derivation_function#Key_st...

> Trying to make one that does both leads to something that sucks at both.

eh? Nobody is advocating that what-so-ever. Not me, not anybody.

I think you're now arguing with me for the sake of arguing with me. :-/

Ok, that was sloppy phrasing on my part, but my point still stands: The entire reason that KDFs exist is to deal with the fact that hash functions are designed to be as fast as possible. If it made sense to make hash functions slow, then KDFs would not be needed.

While KDFs do use hash functions internally, the hash function is an implementation detail.

Ok, thank you. That does make more sense now :)

Ironically this whole thread played out in the opposite direction at the time of SHA3's announcement:



tl;dr: Keccak is not a good partner for PBKDF2. It has good hardware performance but comparatively poor software performance. This benefits attackers with FPGAs or better.

The hilarious part here is Wikipedia calling PBKDF2 modern. It's a 2000 minimal update (to generate more bits, to be kind of UTF-8 aware) of a 1993 standard.

At the time of the RFC publication it was already obvious its security was way behind bcrypt that was used in OpenBSD since 2.1 (June 1997), which did its best to be ASIC hostile, which isn't the case for PBKDF2.

In retrospect, NIST choosing PBKDF2 over bcrypt in NIST SP800-132 could be seen as part of the effort to weaken standards for NSA profit.

You're being downvoted because despite the "wealth of other content online about using cryptographic hashes for password storage", you haven't actually read enough of it to know that neither SHA3 nor BLAKE are appropriate for password storage. Neither is a key stretching function. Ignorance isn't a sin, but unwarranted overconfidence is.

SHA3 and BLAKE are cryptographic hash functions, but they are fast collision resistant hash functions, NOT key stretching functions. They're primitives used in the construction of other cryptographic tools. Some cases where you might use a fast collision-resistant hash:

* Timing-attack-resistant string comparison: if you're comparing API keys, you should hash them both first to prevent an attacker from guessing the keys a character at a time.

* HMAC (look it up).

* One can implement key stretching functions by applying fast collision-resistant hashes multiple times.

* Signing (one can hash a message and sign the hash, which provides as much security as signing the message itself).

* Fingerprinting (a fast hash of a public key can be used as shorthand to verify public key ownership without having to read off the entire key).

* Addressing (Bitcoin uses a fast hash of a public key as an address).
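The first item in the list above could be sketched as follows, assuming Python's `hashlib` and `hmac`. The function name `keys_match` is made up, and SHA-256 is just one possible choice of fast hash:

```python
import hashlib
import hmac

def keys_match(presented: bytes, stored: bytes) -> bool:
    """Hash both API keys, then compare the fixed-length digests in constant time."""
    a = hashlib.sha256(presented).digest()
    b = hashlib.sha256(stored).digest()
    return hmac.compare_digest(a, b)

print(keys_match(b"api-key-123", b"api-key-123"), keys_match(b"api-key-124", b"api-key-123"))
```

Hashing first means any remaining timing difference depends on the digests rather than on how many leading characters of the key the attacker guessed correctly.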

I will admit that I'm not familiar with SHA3 nor BLAKE - in fact my OP demonstrates that I'm looking for more information about why BLAKE is a "better" hashing function. But I am actually quite aware of timing attacks, HMAC, and the other points you raised.

Maybe my "wealth of other content online" comment pissed a few people off - but equally I was pissed off that my original comment was downvoted so heavily with a few comments that weren't entirely accurate in response (I've often said the negative rep on HN gets overused and often causes more arguments - but that's another topic). Anyhow, I've removed my offending comment now and I'm glad to see that the quality of responses has improved :)

Most applications of hashes, to inputs that aren't passwords, are protected because their input space is so large that it can't be brute forced or because their input simply isn't secret. The speed of the hash function then becomes a feature instead of a weakness.

For example, suppose I give you the hash of a random 128-bit network packet payload, and you have hardware to evaluate the hash function I used a quintillion times per second. How long will it take you to find that packet? Well, there are 2^128 possibilities and you go at a rate of 10^18 per second, so... 2^128/10^18 s ~= 10 trillion years.
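The arithmetic is easy to check. A quick sketch in Python, where the function name `years_to_enumerate` is hypothetical and a 128-bit search space is assumed to match the 128-bit payload:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_enumerate(bits: int, guesses_per_second: float) -> float:
    """Years needed to try every value in a 2**bits search space."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR

# 128-bit input at 10**18 hashes per second: on the order of 10**13 years.
print(f"{years_to_enumerate(128, 1e18):.2e} years")
```

On average an attacker would find the input after half that time, which changes nothing about the conclusion.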

Key derivation functions (i.e. password hashes) are a specialized version of hash functions for private low-entropy inputs. They need to be slow to prevent quickly enumerating and evaluating all likely inputs. Paying that time cost would be unnecessary, wasteful, and bloated for inputs that are public or high-entropy.

For example, Git would suffer enormously if it used a key derivation function instead of a standard hash function. Useful operations like rebase and squash would go from taking milliseconds to taking minutes or hours.

I feel silly just wading back in here, but...

Even assuming we're optimizing our selection for a KDF, blake2 is probably still the better choice. An attacker is likely to be using a hardware implementation. Your server is using software. An algorithm that's comparatively efficient in software reduces the disparity in capabilities.

General-purpose hash functions should be as efficient as possible. That's why they shouldn't be used for password hashing directly. There are special hash functions (slow, memory-intensive, hard to parallelize) for storing password hashes: https://password-hashing.net/

BLAKE wasn't specified as being general purpose and was compared against SHA-3 as being "better" because it's faster. Since SHA-3 does support cryptographic functions, my comment is a reasonable response stating that performance isn't the only metric when choosing a hashing function.

In any context where slower is better, SHA3 is not nearly slow enough.

Your comment (none of them, really) was not at all reasonable, assuming as it did that cryptographic hash is synonymous with password hash.

I didn't make that assumption. I exampled one use of cryptographic hashes as being for password hashing. An example is not the same as saying two things are the same.

For a fast hash, being fast is always better. You were not pointing out that there are other metrics, you were directly contradicting a true statement, that BLAKE being faster makes it better.

There is no use case where you want your super-fast hash to be 50% slower.

With a hash this fast you need to get thousands or more times slower to have any benefits in those specialized use cases.

It's sort of a bathtub curve.

There are lots of non-password-hashing applications of cryptographic hashes, like virtually every authenticity application. Your computer calculated hashes to allow you to post that post on HN.

Although we do want hashing to be slow when it's password hashing, that doesn't mean we want our general-purpose hash primitives themselves to be slow.

I know. I made that comment in the post you're replying to.

I think you are confusing hash functions with key derivation functions..

I'm not. Hash functions are used in cryptography for password storage (eg SHA2-512).

Best practice would be to use a KDF with a hash salt, but in a lot of cases, SHA2 + salt + pepper is sufficient.

> Hash functions are used in cryptography for password storage

hint: they're used for more than that.

you are woefully out of your depth here.

> you are woefully out of your depth here.

I very much doubt that since I've been able to provide back up sources to evidence my points. However if I am wrong then please do educate me instead of posting uninformative troll comments like the above.

"If BLAKE isn't intended for cryptography then it's not a direct competitor to SHA-3."

If you didn't know that BLAKE was one of the entries in the SHA3 competition (one of the finalists in fact), I think you are out of your depth.

Those two points aren't mutually inclusive. I do understand cryptography and hashes - maybe not to a security researcher's level, nor am I up to date with the latest proposals - but that doesn't mean I'm out of my depth to ask the questions I've been asking either.

Furthermore, the way you conduct yourself in these posts isn't exactly helpful. You're condescending and terse. Your comments contain the bare minimum information (or in the case of this latest exchange, no useful information what-so-ever) and it feels very much like you're more interested in winning ego points than you are in educating someone who could clearly benefit from your greater wisdom. Which is sad because without that exchange of knowledge, these kinds of threads will keep happening.

edit: that said, I did appreciate your comment about hardware vs software: https://news.ycombinator.com/item?id=10012537 so thank you for that post :)

If you're interested in learning, ask more questions. This thread started off with some dubious statements, and when that was pointed out, you dug in deeper and deeper. That doesn't inspire much confidence that you will appreciate someone taking the time to educate you. Hence, the terseness.

My first draft of that post was more question-based than statement-based but for some reason I thought the rewritten version was better presented. In hindsight that was really poor judgement on my part. Sorry for that. But I assure you that I'm very much interested in learning :)

The "digging deeper" was more down to responding to other people who had misunderstood my original comment. And let's be honest, there has been a lot of that as well.

edit: oh jeez, someone's been on the rampage with the ▾ clicking. I think it's time I give HN a break because the moderation on here has really been bugging me in recent months. I know it's a tired cliche, but I'm sure this community used to be less hostile. </soliloquy>

KDF should be slow.

Stream ciphers should be fast.

Message authentication should be fast.

Hash functions can be used in any of the above, therefore they should be fast. Key derivation functions generally work by taking a secure hash (that can be done fast), and then applying an operation that transforms the output of the hash function into another output that requires a lot of time and/or memory. And then possibly taking the hash of this as well to obscure the internal workings. The "slowness" is part of the KDF, and not part of the hash function. The reason it's tuneable is because it doesn't matter how fast the hash is, you can just run more iterations.

> SHA2 + salt + pepper is sufficient

Not even close. The attempts/sec you can run against SHA-2 vs. the attempts/sec you can run against bcrypt or scrypt are an order of magnitude apart.

You should not use anything less than a proven KDF (scrypt, bcrypt, PBKDF2-HMAC-SHA2 with a lot of rounds) for password storage.
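For the PBKDF2-HMAC-SHA2 option, a minimal sketch with Python's standard `hashlib.pbkdf2_hmac`. The helper names, the 16-byte salt and the default iteration count are assumptions for illustration, not recommendations from this thread:

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 600_000):
    """Derive a PBKDF2-HMAC-SHA256 hash with a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Re-derive with the stored salt and iteration count, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

Typical use would be `salt, n, digest = hash_password("hunter2")` at signup and `verify_password("hunter2", salt, n, digest)` at login. The iteration count is the tunable slowness, and it is stored alongside the salt so it can be raised later.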

Many of your posts conflate "cryptographic" with "password hashing", which is certainly not the case. Note that PBKDF2-HMAC-SHA2 does not (at all) mean that SHA-2 is useful for password hashing. As someone else pointed out, that's just an implementation detail.

Isn't the main purpose of a hashing function to validate file (edit: or "message") integrity? so you want them to process as many MB/s as possible?

I thought using hashing to save passwords is misusing cryptographic hashes for something they were not intended to do? (Assuming they will keep the input secret as opposed to preventing finding colliding duplicate inputs?)

Collision attacks are a non-issue with modern hashes.

The point of hashing passwords is that it's a one-way cipher, i.e. it can't be decrypted - it can only be brute forced or attacked with rainbow tables (the latter is where salts and peppers come into the equation).


You keep defining "cryptographic" hashing as equivalent to password hashing, but password hashing is only one application of cryptographic hashing (a minority application). A hash function is still "cryptographic" and its applications are still "cryptographic applications" when they aren't password-related. Those applications commonly optimize for speed, rather than pessimizing for speed. In many of those contexts, the input to the hash and the hash value are presented simultaneously to the verifying party (or aren't even secret); in those contexts there is no benefit at all from making the cryptographic hash function slow to compute, and considerable benefit from making it fast to compute.

As per https://news.ycombinator.com/item?id=10012384:

I didn't make that assumption. I exampled one use of cryptographic hashes as being for password hashing. An example is not the same as saying two things are the same.

I've possibly expressed myself rather poorly, but I think quite a few people on here have made some incorrect assumptions about the point I was raising.

SHA-2 is still dandy - If I understand correctly SHA-3 is in an entirely different family of cryptographic hash functions, so that if/when a problem is discovered with SHA-2 there's another standard approved and ready to go - but that doesn't mean SHA-3 can't be used now.

I gotcha, that makes sense to do it that way, such that if the foundation of SHA-2 is compromised, SHA-3 can be deployed safely where it's needed.

