The importance of this NIST standard cannot be overstated. It requires that passwords be hashed with a random salt! Microsoft Active Directory does not use salts!
I am trying to imagine the consequences to all the businesses and agencies that must adhere to these standards suddenly coming to the realization that they must replace their Active Directory installations and what that will mean for administering all their Windows systems. It's not going to be pretty.
You might be thinking, "but these are just guidelines!" Yes, but there are actually loads of existing business contracts whose wording states that the business must adhere to all NIST (security) guidelines. Also, you can bet that once this gets finalized other standards will follow suit.
It will be interesting to see what Microsoft does because salts were expressly not added to AD because so much functionality cannot work with random salts in place. They're going to have to break backwards compatibility with a lot of functionality and many 3rd party products that synchronize with AD. Entirely new APIs are going to need to be written.
It will also become seriously annoying to create keytabs at large organizations since only domain admins will have that power once random salts are in place; making automation very difficult.
Presumably Microsoft would add a salt before losing all that business. Perhaps only if you turn on a special NIST mode, in the usual Microsoft way of preserving backwards compatibility.
It might break some functionality, but probably not as much as moving to an entirely different product if you have things written against MS APIs.
This is a great example of how smart standards can leverage incredible pressure against vendors to adopt best practices.
What will happen is that if the new standard is accepted as is, every single implementation will have to craft a Plan of Action and Milestones (POAM), and every security officer is going to write in "Waiting for vendor fix". That's a lot of pressure on Microsoft to change this.
Will it take time? Certainly. Years in fact. But if you want massive change, this is how you start.
I'm a big fan of Active Directory. But I applaud this move. I'm even more excited to see if this will filter down to banks. Probably not as quickly, but one can hope.
That is pretty much how they handled the POSIX compliance requirement for government/forces use of NT/2000: the API was available through an optional (and IIRC not quite complete) module that very few ever turned on, even where it was a contractual requirement that it be present.
The option you suggest could be present but no doubt not be in use by many people because extension-X, third-party-app-Y, or internal-automation-subsystem-Z, or some feature of AD itself, would break.
Do those contracts really require adherence to NIST standards, including whatever future changes are made to them? I would have thought they'd be restricted to the guidelines as they are at the time the contract is signed.
The problem is that they are basically saying that almost every "best practice" recommended over the past 5-10 years is absolutely the wrong thing to be doing. Don't be surprised when people throw up their hands in frustration.
It's been longer than 10 years, and most people have known for quite some time. The xkcd comic is over 5 years old itself, and a lot of people have pushed for more open passwords for ages...
I do one controversial thing, and that's trim password input (mainly because some apps/OSes select trailing whitespace). Other than that, if you can input it, you can use it... though doing some Unicode normalization (for combining-character combos) is probably a good idea prior to hashing.
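To make the trim-then-normalize step concrete, here is a minimal sketch. The function name `canonicalize_password` and the choice of NFKC are assumptions for illustration, not something the poster specified; the point is only that the same visible password should always produce the same bytes before hashing.

```python
import unicodedata

def canonicalize_password(raw: str) -> bytes:
    """Trim surrounding whitespace, normalize to NFKC, and encode as UTF-8."""
    trimmed = raw.strip()                                # drops leading/trailing whitespace only
    normalized = unicodedata.normalize("NFKC", trimmed)  # NFKC is an assumed choice of form
    return normalized.encode("utf-8")

# "é" typed as one code point (plus a stray trailing space) versus
# "e" followed by a combining acute accent: both yield the same bytes.
composed = "caf\u00e9 "
decomposed = "cafe\u0301"
print(canonicalize_password(composed) == canonicalize_password(decomposed))  # True
```

Interior whitespace is left alone, so passphrases with spaces between words are unaffected.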
That comic is unfortunately not telling the truth. The passphrase IS 44 bits of entropy, assuming you input random ASCII. But any reasonably knowledgeable person trying to crack passwords will use a dictionary to create passphrases, rendering this less useful than the first password. Even if you substitute o/0/ø, i/l/1, a/4, etc. randomly, you still need a rather long sentence, preferably spiced up with simple substitution, and then the passphrase is no longer easy or simple to remember.
If you choose four random words from a list of 2048 common words, and your attacker knows that's what you're doing, then your entropy is 4 * log_2(2048) = 44 bits. If the attacker didn't know your strategy and tried to brute force letter by letter it would be much higher - around 48log_2(26)=150 bits assuming around eight letters per word - but like you said, we should assume the attacker knows exactly what strategy we're using, so 44 bits is the better number to work with.
Since that's still more than 28 bits, a 'correct horse battery staple' type password is harder to crack than a short random string of mixed characters, even if the attacker knows exactly what your password generation strategy is.
I was a bit hasty about the entropy of the passphrase, my mistake. I still maintain that even if we choose from 2048 common words, generating a good passphrase (one that isn't a common sentence) is harder than we think.
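The arithmetic above can be checked with a few lines of Python. This is just a sketch; the helper name `entropy_bits` is made up for illustration, and it assumes every pick is uniform and independent.

```python
import math

def entropy_bits(choices: int, picks: int) -> float:
    """Entropy in bits of `picks` independent, uniform draws from `choices` options."""
    return picks * math.log2(choices)

words = entropy_bits(2048, 4)      # four words from a 2048-word list
letters = entropy_bits(26, 4 * 8)  # 32 lowercase letters guessed one at a time
print(f"{words:.0f} bits vs {letters:.0f} bits")  # 44 bits vs 150 bits
```

The 44-bit figure is the one that holds when the attacker knows the scheme and the word list.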
Yes, 2048 is tiny. I've been using a 4096-word dictionary I found online for years, along with my family and two kids since they were about 6 years old.
There is absolutely no trouble with a 4096 word dictionary. Yes, they (and me too) sometimes bump into words we don't recognize, but it's not that common.
Here, I just generated you a few passwords:
* hefty march attempt force bowel scuff
* between sepia book sweat lemma saint
* safe warn magical cask hefty wish
* alum glib puck adieu dour lazy
* telephone pine cavort good knee swank
* numeral plan jewel conch slate tube
* pastry piano sure proxy unit brew
* trig rise taint current sans gallop
Here are the same random numbers, but encoded as ASCII instead of words:
* 81Pk3t?Rq6S}
* ]CPcYrT^?iE3
* +qV`J9ZU&.,C
* `>sp=~V);3g>
* E&_ff7a|Z4B[
* ?OX~[J>0K'S*
These each have the exact same amount of entropy as the word-based ones.
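For anyone who wants to generate passphrases like the ones listed, a rough sketch follows. It assumes a 4096-word list loaded from somewhere; the `demo_words` placeholder below is made up so the snippet runs on its own, and the function names are illustrative.

```python
import math
import secrets
from typing import Sequence

def make_passphrase(wordlist: Sequence[str], n_words: int = 6) -> str:
    """Pick n_words uniformly at random (with replacement) using the OS CSPRNG."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

def passphrase_bits(list_size: int, n_words: int) -> float:
    """Entropy in bits, assuming the attacker knows the list and the word count."""
    return n_words * math.log2(list_size)

# Placeholder list: in practice this would be read from a real 4096-word file.
demo_words = [f"word{i:04d}" for i in range(4096)]
print(make_passphrase(demo_words))
print(passphrase_bits(len(demo_words), 6))  # 72.0
```

The important design choice is that the computer picks the words; the list itself can be public.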
Yes, and good luck typing passwords like "between sepia book sweat lemma saint" without typos :)
I just watched a friend using some kind of long password and it took about a minute to get the password entered correctly. It was also easy to watch the keyboard as he typed and guess the words used.
There are not 3000 _core_ words. You don't teach elementary school children 3000 words. That list is significantly smaller. In Denmark it's 120 words, and then you'll be well on your way to reading and writing most basic stuff.
That someone has selected 2048 words to generate passphrases from doesn't make them easy to remember.
Eh, you may want to peek at xkcd's Thing Explainer to see what life is like under 1k words.
Everybody uses more than a few thousand words in at least one language. And selecting the least used ones will make your passphrases easier to memorize, because they have much more concrete meaning (by virtue of their rarity) than the most used words.
I've been doing this for years and it is indeed easier to remember. You won't be keeping all your passwords in your head in any format if you're using a unique password per site or service as you should. But you occasionally have to buffer them mentally between your password manager and the input box (especially on mobile), and the same number of bits of entropy are infinitely easier to copy correctly in this format vs something like "?G[G6n|4".
Err... I would expect most (non-English?) people on the internet to know 2000 words. (By virtue of being at least bilingual, they would, so the challenge boils down to whether they can type them. Unicode support should help, but I've found that in most cases people simply type those sounds in English.)
As for the English as a primary language people, I've no clue about the number of words, but if we can expect them to type 1024 words, we'd still get 20 bits..
The GfyCat URL generator list is about 1800 animals and 8000 adjectives [0]. All fairly memorable words. Add in other types of words and 2048 quickly becomes a small number.
The words don't have to be incredibly simple, either. Grabbing a random GfyCat image from the front page gives me "OrangeLankyBasilisk" which is easy to remember, none of those words are particularly foreign. But they're not in your list of most common words, nor are any of them in XKCD's simplewriter [1], which keeps track of the 1000 most common words.
Edit: The words need to be randomly chosen by a computer, so it doesn't matter what the most common words are. You could generate the password from a list of 2048 Spanish words or Japanese words or emojis and the entropy is the same as in the previous scenario (assuming the attacker knows your dictionary of symbols just like they'd know your dictionary of English words). If you let a human choose, of course the smiley-face and poop emojis are going to be picked 50% of the time, but that's not the intention of the comic or of passphrases.
You can if it's a computer picking the password from presumably common words.
If the human is picking anything, then there WILL be a bias in selection. Effort should be made to minimize that, but even with education this is a difficult task for any worker.
It got to the point where I actually took a classic literary work and made a 'password words' dictionary from it just so that I could have the computer generate possible new passwords. (The bias there is mostly filtering out things that might be offensive... because someone can always take it the wrong way even if you explain in advance the theory of how the password was created by a computer.)
I've been pretty happy using diceware[0] for my word-based passwords. Just be sure to use actual dice and not a software based random number generator if you want it to be truly random.
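As a rough illustration of how dice map to words: Diceware uses five rolls per word, so the list has 6^5 = 7776 entries. The helper below (`rolls_to_index` is a made-up name) treats the five rolls as a base-6 number; it assumes your word list is sorted in roll order, from 11111 through 66666.

```python
from typing import Sequence

def rolls_to_index(rolls: Sequence[int]) -> int:
    """Map five die rolls (each 1-6) to an index in range(6**5), i.e. 0..7775."""
    if len(rolls) != 5 or any(r < 1 or r > 6 for r in rolls):
        raise ValueError("expected five rolls, each between 1 and 6")
    index = 0
    for r in rolls:
        index = index * 6 + (r - 1)  # treat the rolls as digits of a base-6 number
    return index

print(rolls_to_index([1, 1, 1, 1, 1]))  # 0
print(rolls_to_index([6, 6, 6, 6, 6]))  # 7775
```

Each word chosen this way is worth log_2(7776), a little under 13 bits.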
Mix in some other languages or slang words, and a number is always possible and easy to remember. You may choose to consistently replace the space with a comma, a semicolon, or another easy-to-type character (not one that requires a shift if you keep everything lowercase).
> If you choose four random words from a list of 2048 common words, and your attacker knows that's what you're doing, then your entropy is 4 * log_2(2048) = 44 bits. If the attacker didn't know your strategy and tried to brute force letter by letter it would be much higher - around 48log_2(26)=150 bits assuming around eight letters per word - but like you said, we should assume the attacker knows exactly what strategy we're using, so 44 bits is the better number to work with.
> Since that's still more than 28 bits, a 'correct horse battery staple' type password is harder to crack than a short random string of mixed characters, even if the attacker knows exactly what your password generation strategy is.
Typo in the above but can't edit on my phone, the second sum should be 4 * 8 * log_2(26). Apologies.
No, you are wrong. The number of bits of entropy in "horsestaple..." is estimated by assuming the words were chosen at random from the 2^11=2048 most common words. 4*11=44 bits in total. In practice it is even better, since a hacker would also have to try different kinds of passwords! So no, you do not need substitute characters.
Yes, I was wrong about the entropy when writing that. But I still don't think that passphrases are as much of a godsend as the comic makes them seem. Can we really assume 2048 common words? The 100 most commonly used words make up 50% of written words.
A common sentence like "I drove to the mall yesterday" is not a good passphrase, but I'm certain that people who use "rocket" as a password would do something similar.
The intention is that the random words are selected from a list of 2000 unique, common words.
Choosing a sentence is a different strategy, which is less secure.
$ wget -O ⅓Mwords http://norvig.com/ngrams/count_1w.txt
$ for i in $(seq 10); do awk '/^[a-z]{3,}/ { print $1 }' ⅓Mwords | head -n 2000 | shuf -n 5 | tr '\n' ' ' && echo; done
videos possible disease maintenance chair
teen documents than without son
research interface library largest drive
location ball beauty coming files
files middle fri meet air
guarantee samsung click super inn
legal previous rent resort use
reply thought better fresh phentermine
bad command once vehicle australian
fun random professor course sponsored
I'm not suggesting that 20 random characters are easier to remember, but for the average Joe they might as well be the same. They have to remember not only the words, but also the sequence and how to spell them. Unfortunately we cannot expect this from users in general: the worst offenders write down a password like "rocket", so there is no hope that they'll try to remember a sequence of random words.
We shouldn't have to remember passwords at all, IMO. That is creating entropy by remembering things, but the human brain is inherently bad at remembering exact things. Something like a YubiKey is a better idea: plug it in, enter your PIN, and use a key pair to authenticate. All the user has to do is keep track of the physical thing and the PIN.
Even those 44 bits are too little nowadays. Passphrases are not a godsend, but something good to use when the correct technology - a password manager - is not available.
A notable use case is choosing a master password for your password manager. And you'll want a longer phrase.
The idea is you use a mnemonic generator to pick the words. The fact that "100 most commonly used, make up 50% of written words" (a dubious statistic, source?) is irrelevant.
> the idea is you use a mnemonic generator to pick the words.
I know you are supposed to use a generator to pick the words, that is how BIP39 for bitcoin works. But average Joe is not going to do that. He will select "I went to highschool in 1992". Authentication is a hard problem, and unless you force a reasonable scheme, it will be weak.
I think you failed to think that through. Random alphanumeric input of that length would be 25 * log_2(36) ≈ 25 * 5 = 125 bits. It is 11 bits per word because each word is chosen from a ~2000-word dictionary.
Edit: should have reloaded, said by enough people already :D
"best practice" according to whom and "recommended" by whom? Maybe by security theater or other sources of BS, but not security experts (who are worth their, ahem, salt).
Additionally, there are rumors (maybe more than that now) that the next round of HIPAA/HITECH etc. regulations is going to adopt a lot of NIST standards and frameworks. So this could potentially trickle down to the healthcare industry.
No it cannot since the authenticating client won't know the salt.
NTLM isn't the only authentication method that is fundamentally incompatible with salts. The way Windows clients perform initial Kerberos authentication (getting the TGT) will break as well.
Kerberos is easier to fix though, since the protocol was made to work with salts. It's just that the way Microsoft implemented it leaves it vulnerable to pass-the-hash-style attacks.
Another issue is how applications handle incoming Kerberos authentication. For example, when you configure IIS to perform Kerberos authentication you (usually) need to specify a service account and provide the password. Instead of storing the password IIS will simply pre-compute the hash (which it can do because there's no random salt) and store the result using DPAPI (I think... Since it's a Microsoft product it may use a different nonpublic API). If using random salts IIS won't be able to do that anymore and will have to work like traditional MIT Kerberos applications--using a keytab (provided by an administrator).
If a salt matters, that means you can look the password up in a dictionary of hashed passwords. Which means it's in a very small search space. Which means an exhaustive search would find it quickly anyway.
Far more relevant than "salts" would be using a PBKDF (See PBKDF2, SCRYPT, BCRYPT - which come with a salt for free anyways) with an appropriate number of iterations.
What was the last year that Rainbow Tables were relevant anyways - 2010? Earlier?
Even if you assume weak passwords, there's no comparison in effort when you involve stretching.
No salt means you can precompute the hash for the 1,000,000 most common passwords. That'll take maybe an hour or two with aggressive stretching. Then you check it against the million or so records in the database for any matches.
Salt means FOR EACH RECORD in the database, you have to compute those 1,000,000 passwords. That means you're talking about decades of computation time. That's a heck of a lot more expensive. And importantly, you have to start that computation AFTER the breach, because that's when you gain access to the salts.
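A toy sketch of that difference, assuming PBKDF2 as the stretching function. The tiny password list, the user names, and the low iteration count are all made up so it runs instantly, and "no salt" is modelled as one constant value shared by every record.

```python
import hashlib
import os

ITERATIONS = 1_000       # kept low for the demo; real settings are far higher
NO_SALT = b"\x00" * 16   # stand-in for "no salt": every record shares this value

def stretch(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

common = ["123456", "password", "qwerty", "rocket"]  # stand-in for the top-million list
users = {"alice": "rocket", "bob": "hunter2", "carol": "password"}

# Unsalted: one table, built once (even before the breach), reused for every record.
unsalted_db = {u: stretch(p, NO_SALT) for u, p in users.items()}
table = {stretch(p, NO_SALT): p for p in common}
cracked_unsalted = {u: table[h] for u, h in unsalted_db.items() if h in table}
unsalted_work = len(common)

# Salted: guesses are recomputed for each record, and only once the salts have leaked.
salted_db = {}
for u, p in users.items():
    salt = os.urandom(16)
    salted_db[u] = (salt, stretch(p, salt))
salted_work = 0
cracked_salted = {}
for u, (salt, h) in salted_db.items():
    for guess in common:
        salted_work += 1
        if stretch(guess, salt) == h:
            cracked_salted[u] = guess
            break

print(cracked_unsalted == cracked_salted, unsalted_work, salted_work)
```

The same weak passwords fall either way; what changes is that the attacker's work scales with the number of records instead of being paid once.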
> Salt means FOR EACH RECORD in the database, you have to compute those 1,000,000 passwords. That means you're talking about decades of computation time.
SHA256 of the million most common passwords (with a 128 bit salt) takes 2 seconds /total/ on my laptop. That's only about 24 days (2 seconds × 1,000,000 records) to crack a database of one million hashes.
var stopwatch = Stopwatch.StartNew();
var salt = "0123456789012345";
int crackCount = 0;
foreach (var password in File.ReadLines(@"c:\temp\10_million_password_list_top_1000000.txt"))
{
System.Security.Cryptography.SHA256.Create().ComputeHash(Encoding.UTF8.GetBytes(password + salt));
++crackCount;
}
stopwatch.Elapsed.Dump();
crackCount.Dump();
=>
00:00:02.0981333
999999
Salt is indeed important, but the only real way to protect is to iterate - PBKDF2/scrypt/bcrypt.
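For comparison with the single-pass SHA256 benchmark, here is a minimal sketch of salted, iterated hashing using PBKDF2 from Python's standard library. The function names and the default iteration count are assumptions for illustration; pick the work factor for your own hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

def hash_password(password: str, iterations: int = 100_000) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, digest); all three are stored with the account."""
    salt = os.urandom(16)  # 128-bit random salt, unique per record
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

salt, iters, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iters, digest))  # True
print(verify_password("rocket", salt, iters, digest))                        # False
```

Storing the iteration count next to the salt lets you raise the work factor later without invalidating existing records.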
I think in the real world there are two scenarios. In the first, someone is using one of PBKDF2/Bcrypt/Scrypt, in which case all of this is moot: set an appropriate work factor/number of iterations, and you are secure with even a moderate sanity check against the password. (Interesting note: said sanity check, ironically, could be as simple as a lookup. The top 4 billion passwords can be stored in a packed 24 gigabyte lookup table on disk which can be searched in < 100 milliseconds.)
The other scenario is when you are doing a single pass of a hash, in which case the salt is irrelevant for the security of that password.
Everyone here understands that you need to salt when your dictionary takes a long time to build (say, more than 1 millisecond/password, which equals 2.5 billion passwords/month). Not everyone appreciates that salting a fast hash (more than 2.5 billion passwords/second) adds no security to that particular password.
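The packed lookup table could plausibly work like this: 24 GB over 4 billion entries is 6 bytes each, so one guess (an assumption on my part, not something stated above) is a sorted array of 6-byte truncated hashes searched by bisection. The sketch keeps the table in memory; on disk the same search would seek instead of slice.

```python
import hashlib

ENTRY = 6  # bytes per entry: 24 GB / 4 billion passwords (an inferred layout)

def fingerprint(password: str) -> bytes:
    """First 6 bytes of SHA-256; truncation allows rare false positives."""
    return hashlib.sha256(password.encode("utf-8")).digest()[:ENTRY]

def build_table(passwords) -> bytes:
    """Pack the sorted, de-duplicated fingerprints back to back."""
    return b"".join(sorted({fingerprint(p) for p in passwords}))

def contains(table: bytes, password: str) -> bool:
    """Binary search over the fixed-width entries."""
    target = fingerprint(password)
    count = len(table) // ENTRY
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if table[mid * ENTRY:(mid + 1) * ENTRY] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo < count and table[lo * ENTRY:(lo + 1) * ENTRY] == target

table = build_table(["123456", "password", "rocket"])
print(contains(table, "rocket"))
print(contains(table, "between sepia book sweat lemma saint"))
```

With 4 billion entries the search needs about 32 probes, which is consistent with a sub-100-millisecond lookup even on a disk.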
The scenario where there is moderate "stretching" (by which I presume you mean running multiple iterations of a hash), with no randomized salt, is a bit of a straw man - who would bother to go the effort of "stretching" and not stick a randomized salt in while they are at it?
So - I think we basically agree with each other, that for a straightforward single round of a hash like SHA256, that a salt is now irrelevant (reason - a GPU cluster can check 10s of billions of passwords/second)
I think the point I was trying, and perhaps failing to make, is that the three PBKDFs - PBKDF2/BCRYPT/SCRYPT - all come with a salt anyways - so you don't really need to call them out.
What I guess I should have made explicit, (and I didn't) is that if all you are doing is a single round of a SHA - then adding a salt at the beginning isn't going to make that password any more secure. If it could fall to a Rainbow Table Lookup, then it will fall in pretty much the same amount of time to a password-cracker - on the order of milliseconds.
Now this is something I don't understand. Is this really the best we've got?
If you somehow get into a server with those hashed passwords, you'll probably try to mirror it and everything on the server, to figure out the hashing mechanism. So let's say time is not an issue, also considering long-lasting APTs.
So what will you do with it? Why should you try to brute-force a million passwords, each one taking 2 hours to crack? No, you're smarter than that: why not create a rainbow table for the likely admin accounts, VIPs, state actors, etc.? How many are those? 100? 1000?
Nice, you just reduced the search space by a factor of 1000.
That whole circus about hashing... I know, randomly salted hashing makes sense, but true secure password storage is a myth.