I guess this is a good primer for some folks, but there are two really spurious arguments:
(1) Your working vocabulary of 20k words is irrelevant. If you pick the words yourself, they aren't random and your random-word-vocabulary (the ones you'd actually pick) is waaaay less. If you use a randomizer, which you should, just give it the full dictionary (or say, top 100k for memorability).
(2) Faced with the suggestion of adding more words, they show that 4 is crackable. But go to 5, and you're solid for 400y. This is so clearly relevant that excluding it seems suspect. This article should be as simple as "use 5 random words, not 4".
Password managers have their own issues, both with usability and their tendency to provide centralized targets. And, they still require a secure passkey themselves -- for which a long random phrase is a perfectly reasonable approach.
That's a big IF. People should use a password manager, or create unique random passwords for every site... but they don't. Of course, if someone other than you chooses the words, you'd need to include the entire dictionary.
"But go to 5, and you're solid for 400y. This is so clearly relevant that excluding it seems suspect."
I haven't excluded this; I make reference to DiceWare at the bottom of the page. This covers both using more words and choosing them at random, which is the entire point.
Oops - I didn't notice this part was related because I haven't used diceware! I still think it's weird to put something this fundamental to your argument in a footnote under one product above, but apologies for implying you ignored it entirely.
No, using 3 random words isn't "a really bad idea". That's complete nonsense.
The only "really bad idea" for passwords is password reuse. A unique 14 character three word password for each site you use will protect you from the threats you face online.
The fact that someone can relatively quickly crack your password if you used 3 random words is meaningless. That only works if they knew that you did that in the first place. Capitalize one random letter in all of that, throw a number between the second or third letter, add dashes, whatever and they're totally screwed. Game over, you win.
Even if they know, it doesn't matter. Online password brute force attacks are extremely rare. If you don't reuse passwords then offline attacks don't really matter to you.
Password reuse, or even following a simple pattern for having a different password for each site (i.e. a seed+site name), is the main problem here. If one site falls, then it should not be trivial to deduce your password or how to attack it for other, maybe more important or secure, sites.
So they brute force an account (a not insignificant effort) and the attacker now sees your password of "fuzzykittytoes". What are they going to do with that information?
Target a bank with it? They can't bruteforce a 14 character password for the bank over the internet. What trivial method would you use to get into your bank account, if you didn't reuse that password?
agreed, the threat of online passwords is a break-in revealing the password database which happens to be in plain text 'ooops!'
and even if the password list is not in plain text, but the attacker manages to find your 3-word password, if you didn't reuse that password anywhere else, then it can't be used to get into any of your other accounts.
and for the account for which they found your password, they already broke into that system, so no additional loss here either.
You can't say "The fact that someone can relatively quickly crack your password if you used 3 random words is meaningless" and "isn't a really bad idea" in the same sentence.
"Capitalize one random letter in all of that, throw a number between the second or third letter, add dashes, whatever and they're totally screwed"
So.... ignore the "just use 3 words" advice completely and add random letters, numbers and dashes to make it secure? You'll find that's the advice we've given for years... and it too is ignored.
No, it is not, if you assume that people who crack passwords are dumb and brute-force everything. Second, you assume people are good at coming up with random words on the spot; if you make a password for LinkedIn, many will probably end up with "linked blue something".
In reality, brute force is the last resort: you have dictionaries and keyboard walks. You can also use additional data about your target, such as their name, if you have the database.
On the other hand, if you use a password manager to keep track of all those different passwords, just make them random 128bit hex values, you won't need to memorize them anyway.
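For what it's worth, generating a value like that is a one-liner. A minimal Python sketch, assuming the standard-library `secrets` module is an acceptable randomness source (the function name `random_hex_password` is made up for illustration):

```python
import secrets

def random_hex_password(bits: int = 128) -> str:
    """Return `bits` of OS-provided randomness encoded as a hex string."""
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("bits must be a positive multiple of 8")
    # token_hex(n) returns 2*n hex characters encoding n random bytes.
    return secrets.token_hex(bits // 8)

if __name__ == "__main__":
    # 32 hex characters, different on every run
    print(random_hex_password())
```

Some sites cap the password length or demand specific character classes, so a real generator would probably need a configurable alphabet as well.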
I'm not sure how this made it to the front page. It mashes up several things about passwords, some of which are dangerous, some of which are smart. It acknowledges the strength of a system like diceware. Then it concludes 'Don't use words in passwords. Ever.' while failing to distinguish between passphrases based on words (like diceware generates) and having a password that is "benisgreat".
The ONE thing stopping me from using a password manager is fear that I'll lose the master password. How do I get over this fear?
Another thing is for example Chrome's built in password manager. If I make a bunch of accounts with these passwords, do I NEED Chrome to ever be able to log in to these accounts?
Write it down on paper, put it in an envelope, and leave it in your home, preferably in a small fire resistant lockbox that you keep other important documents in.
If your home isn't secure enough for this purpose due to roommates or other issues, ask a trusted friend, parent or relative to hold onto it and your other important documents on your behalf. If your situation makes putting it in someone else's hands a problem, you can split the password or other key material between multiple locations as well. 1Password makes it easy with the "rescue kit", and KeepassXC can use a combination of a password and file-based key for this purpose.
Security conversations on this topic typically go down some rabbit hole about the government, police etc. Keep it simple and don't crawl into that hole. End of the day, in the United States, the most reliable and legally secure place is your home or in some cases an attorney's office.
Safe deposit boxes are an option but are expensive, inconvenient and often less secure. A bank clerical error, fubar, or payment issue could result in loss of the contents at any time. (See: https://sacramento.cbslocal.com/2018/07/26/safe-boxes-stolen... )
When you need your password at 4:30PM, the bank is closed.
Also, it’s an awful place for important documents like a will, as the bank will require a court order for your family to open it upon your death.
If you're not using a password manager now, that implies you're already remembering at least one password. Using it on a regular basis is a good way to keep it in your head.
You can set up your password manager authorization to expire periodically (e.g., every 30 days) as well as on reboot. With LastPass (probably others) you can also mark certain sites as 'high security' and require entering your master password again. I do this with a small handful of sites, including my bank and domain registrar. I typically log into my bank once or twice a month.
I very rarely reboot my devices (sleep + locked with a different password I have memorized and type in) so I don't get prompted for it that much otherwise, but enough that I have it memorized.
To keep track of your master password, just write it down and put it in your wallet/purse. We're used to and experienced with securing physical objects, and while it still offers an avenue to expose all your passwords, it requires a lot of extra work to do so.
Basically, storing your master password physically is only really vulnerable to highly targeted attacks.
I would put it somewhere more secure than wallet/purse, unless you are forgetting your master password daily. I put mine in a place similar to a hidden folder in a filing cabinet. I haven't forgotten my master password yet, but I know that it's there if I need it.
I would imagine that anyone who is dedicated enough to violate your person to obtain your master password would also not hesitate to violate your house to obtain your master password (in fact, they may do so first).
I think the concern is more that it is a lot easier to accidentally lose a wallet or purse (or have it stolen for unrelated reasons), rather than a targeted attack against your person for the master password.
Well, the copy of it is just for backup purposes most times anyways, so the loss of the copy isn't going to be unrecoverable in most cases. And, as a bonus, you don't have to return home if you do forget it.
And, as I mentioned earlier, we're really quite good at keeping track of wallets and purses in the first case. Losing them is typically (at least in my own experience) a once-in-a-decade event.
Use the password manager for everything except your main email account. If you lose the password to the manager (which is also a good idea to write down somewhere), you can always recover all the other passwords individually as long as you have access to your email.
Use a master passphrase instead of a password. Make it quite long, and personal, so that you won't forget it easily. The longer you make it, the more "obvious" (to yourself) can you make it sound, without compromising its security. Something like "It was a sunny afternoon the day I met my lovely puppy Rocket". Just use your own words, and the chances of anyone guessing it will be lower than those of brute forcing it.
If you want to also have a shorter password for daily use, save it in a separate database protected with that long passphrase.
Use a password manager! The transition takes some time/effort, but once you have one fully set up (I adore 1password) it's both easier and far more secure than any scheme you keep in your head.
Re: losing access - a 1password family plan allows your family members to recover your account. Or just write your master password down on paper and store it somewhere secure - like a safe or safety deposit box.
Writing your password down on paper seems reasonable. Personally I use something that I deem to be secure, which is a reasonably long, but memorable (for me) sentence, using a pretty large character set. Feel free to Math me wrong on that.
I use LastPass (seem to be in the minority here - there are definitely some things about it that bother me; never tried 1Password). Password manager has simplified my life. Not much any more do I have to wonder which password I used where, etc.
I just remember two strong passwords. My master password for my manager and my gmail password. My gmail can recover pretty much all my other passwords. Also I keep backup secrets for my gmail as well. This feels robust enough, but backing up my password manager might be wise too.
Most services have a mechanism through which you can reset your password, usually email. Losing access to your password store isn't the end of the world.
Most password managers have an emergency kit, where you can print your master password and other info, and put it somewhere like a safe, bank vault, etc
i hate this kind of pedantry. it just has to be better than what people currently use, which is stuff like password1, or 123456. yes, in an ideal world we would all use password managers, but that hasn't happened yet, either.
Yep, exactly this. Perfect is the enemy of good. This is like the people who argue against your nontechnical relatives writing down their passwords. In reality, if they do that and have unique passwords for all their accounts, they're actually doing really well!
The base comparison seems to be a mixed character password with a length of 14. That's far more complex than the vast majority of passwords out there, so the simple three word password would be significantly more "secure" than the average password used.
Password managers are relatively easy to set up for technical people, but getting everything working on your home PC, tablet, and phone isn't a simple task for many people.
The threat 99.999999% of us face isn't that someone will bruteforce our password using a Cray. The attack that is currently occurring against you and the people you know is credential stuffing, which simply targets password reuse.
I'm sorry, but that's nonsense if you actually think about it.
Existing passwords are broken in a matter of milliseconds, perhaps a few seconds. Do you honestly believe it's sufficient consolation to an end-user to know it took 40 seconds instead? The end result is the same... they're breached.
My point is simple. We already have FAR superior techniques in 2018 and wasting time advocating questionable techniques is an exercise in futility. The people who follow the advice are still at considerable risk, the majority will ignore it and remain almost as vulnerable.
That's if they're encrypted in the first place. Avoiding re-use is far more important, as others have said in this thread. And humans are bad at remembering random combinations of letters and numbers, but ok at remembering words. If I could get my parents to use a password manager, I would.
But until then, three words is better than re-use, and your pedantry still isn't helping the people who need it most, i.e. people lacking technical literacy in the first place.
I'm guessing you mean hashed, or am I being pedantic? ;)
Anything is better than re-use, that's not really an argument. If I recommended using "1", "2", "3" for three different sites, that's technically better than reusing the same password... but still not safe.
Labelling it as pedantry really ignores the wider point. Having 3 unique passwords which take 40 seconds to break really isn't a great improvement, neither is it pedantic to point out that fact.
I feel like the real take-away from this article is that MD5 is broken, something we've all known for ages.
An application using a deliberately slow algorithm like bcrypt would yield the same results, but without the hassle of retraining all users to use longer passwords.
Yes, that's pretty much the right take-away from this. The whole reason why key stretching techniques like PBKDF2 are so vital is that people simply cannot be expected to remember passwords strong enough to resist brute force at the speed we can achieve these days if they're just hashed with MD5, SHA1, or any of the other cryptographic hashes. That's true whether they're random words, random letters, or any other scheme you can think up.
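To make the key-stretching point concrete, here is a rough Python sketch that times one guess against single-pass MD5 and against PBKDF2. The iteration count of 200,000 and the function names are my own assumptions for illustration, not a recommendation taken from the article:

```python
import hashlib
import os
import time

def md5_hash(password: str) -> bytes:
    # Single pass of a fast hash: cheap for the site, cheap for the attacker.
    return hashlib.md5(password.encode("utf-8")).digest()

def stretched_hash(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    # PBKDF2-HMAC-SHA256 chains `iterations` HMAC rounds, so every guess
    # costs the attacker roughly that many times more work.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def time_call(fn, *args) -> float:
    """Wall-clock seconds taken by a single call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

if __name__ == "__main__":
    salt = os.urandom(16)  # per-user random salt (assumed 16 bytes)
    print("md5:    %.6f s per guess" % time_call(md5_hash, "fuzzykittytoes"))
    print("pbkdf2: %.6f s per guess" % time_call(stretched_hash, "fuzzykittytoes", salt))
```

The absolute timings depend on the machine; the point is only the ratio between the two, which the defender controls through the iteration count.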
Salts make no difference here. They do not harden an individual password; they make an otherwise deterministic process vary per user, so that identical passwords do not produce identical hashes (and precomputed tables are useless).
Of course, using a slower hash algorithm will increase the time required to break it... but we know the majority of sites don't adequately protect passwords.
There are 331 breaches on HIBP. 108 used MD5, 43 used SHAx... meaning nearly half of all breached firms used algorithms which we can break quickly & easily.
I think this misses who the intended audience of this simple password advice is. Hint: it's not us.
Normal people's passwords are terrible! It's the site name, their name, "password123" or a single word with a number on the end.
A lot of advice about passwords is aimed at getting normal people to make easy steps to make their passwords better, not perfectly uncrackable.
Complex passwords don't work for lay users and most of them aren't going to switch to password managers. So getting people to make small steps, like not reusing the same password and making it more than one word and a number, does make a big difference to users' security.
Outside of the Twitter-verse, who actually advocates to use a 3 word passphrase? Use 5 or 6 words and the problems listed here vanish. Also, use diceware or an equivalent generator.
If you're writing out a string of letters and characters, that's not going to be random either. The only way to generate a random password is to use a device that gives you random numbers. If you're relying on your working vocabulary and picking passphrases out of thin air, then your 12-character password of letters, numbers, and symbols isn't going to be very good either, because you're not good at choosing things randomly, and picking letters out of thin air won't be any better than picking words.
This entire article could be rewritten as one sentence: "Use a program or a dictionary/dice to generate your >4 word passphrase."
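For anyone who wants the "use a program" half of that sentence spelled out, a minimal Python sketch follows. The word list here is a tiny stand-in I made up; a real one would be a diceware-style list with thousands of entries, and `generate_passphrase` is just an illustrative name:

```python
import math
import secrets

def generate_passphrase(wordlist, num_words=5, separator=" "):
    """Pick num_words uniformly at random (with replacement) from wordlist."""
    words = list(dict.fromkeys(wordlist))  # drop duplicates, keep order
    if len(words) < 2:
        raise ValueError("word list is too small")
    # secrets.choice draws from the OS CSPRNG, unlike random.choice.
    return separator.join(secrets.choice(words) for _ in range(num_words))

def passphrase_bits(wordlist_size, num_words):
    """Entropy in bits, assuming the attacker knows the list and the scheme."""
    return num_words * math.log2(wordlist_size)

if __name__ == "__main__":
    # Tiny stand-in list for illustration only.
    demo_words = ["correct", "horse", "battery", "staple",
                  "gerbil", "rocket", "sunny", "puppy"]
    print(generate_passphrase(demo_words))
    # 7776 = 6**5, i.e. one word per five dice rolls
    print(round(passphrase_bits(7776, 5), 1))
```

The important design choice is that the machine picks the words, not you; the entropy estimate only holds under that assumption.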
It doesn't even have to be truly random, just "random enough". If you were to use a passphrase like "Use 5 or 6 words and the problems listed here vanish", that wouldn't be random per se, but for an attacker it might as well be.
You can also add irregular punctuation & capitalization, throw in some numbers within words, or include words from other languages that you know or look up. Or use near-words from other languages.
I have always thought of "3 random words" as a starting point for randomizing a password not literal advice to follow.
I wonder about that - given that a dictionary attack relies on known values (and can substitute out "l"'s for "1" and "!") - what happens if you write something like "Ger!bils"? Is the dictionary attack totally nullified, or is that possibly accounted for? That would seem to open up the potential space a ton.
Depends on the attacker, but it's probably accounted for at least partially. In some passwords where I don't care much about security I'll use this technique and some personal, site-specific mnemonic - if their logo reminded me of a gerbil, I might start with the name of a pet gerbil I once had and obfuscate it with character replacement or keyboard proximity replacement (same characters, one row up or down). These passwords are always unique to the site, which doesn't get any financial info.
When I need a secure password that's easy to remember I'll obfuscate the initials of a memorable passphrase from a joke, poem, song or book. My password manager is offline (remember the LastPass hack?) so I need to read from it once and type it in.
This depends on the cracker knowing that you only use words in your password. If they don't know that, then they're back to cracking it based on length. Or at least, attempting all of the words first, then brute force.
If you have n schemes that you’re targeting, that’s only a slowdown by a factor of n, because you share effort between the two schemes.
A cracker can try all the word combinations below a certain depth, then try all the passwords below a certain depth. Then increase the search space a bit for each, as long as they want.
That's fine if your only other scheme is "exact matching words". But once you break from that, even a little bit, the number of different possible schemes balloons quickly. For instance, 3 words + a symbol at the end. A cracker can try all the words, sure. But they're probably not going to try every different "all the words plus a little twist" schemes.
They are, but like—how many of these schemas is a cracker going to try between "test just words" and "brute force". There's a ton of possibilities, and I really don't think they'll bother enumerating them all. That's the point—the cracker doesn't know what schema you're going to use. Sure, maybe they'll test word lists. But not all of:
* word sym
* sym word
* word word sym
* word sym word
* sym word word
* word word word sym
* word word sym word
* word sym word word
* sym word word word
And then what about 2 symbols? Or separating words with spaces, or dashes, or underscores, or nothing? How many schemas are we up to by now? At what point is there no value to trying all of these, and just running the brute force lookup?
Right, but my point is that since the cracker doesn't know which you're going to use, they'll need to stick to broad schemas that will take a long time. It doesn't matter if you only use 1 symbol, if the cracker needs to try every combination of up to 5 symbols. And even a broad schema like you suggest—which would take a long time to complete—will have holes that would be easy to squeak through, which means even those broad schemas have minimal value.
Ultimately, even a schema like "sym? word sym? word sym? word sym? word sym?" is useless. They'll have to use brute force, and that means the difficulty of cracking your password is back to being based on the length.
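To put rough numbers on this sub-thread, here is a small Python sketch that counts candidates for a "sym? word sym? word ... sym?" schema versus plain brute force. The 20,000-word vocabulary is the figure quoted from the article elsewhere in the thread; the 32-symbol set and the 95-character alphabet are my own assumptions:

```python
def schema_candidates(num_words: int, vocab_size: int, num_symbols: int) -> int:
    """Candidates for 'sym? word sym? word ... sym?' with optional single symbols.

    Each of the num_words + 1 slots holds either nothing or one of num_symbols.
    """
    slot_choices = (num_symbols + 1) ** (num_words + 1)
    return slot_choices * vocab_size ** num_words

def brute_force_candidates(alphabet_size: int, max_length: int) -> int:
    """All strings of length 1..max_length over the given alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))

if __name__ == "__main__":
    words_only = schema_candidates(3, 20_000, 0)     # plain three-word scheme
    with_symbols = schema_candidates(3, 20_000, 32)  # same, plus optional symbols
    print(f"3 words only:          {words_only:.3e}")
    print(f"3 words + sym slots:   {with_symbols:.3e}")
    print(f"slowdown factor:       {with_symbols // words_only}")
    print(f"brute force <= 14 chr: {brute_force_candidates(95, 14):.3e}")
```

Under these assumptions the optional symbol slots multiply the three-word search by 33^4 = 1,185,921. Whether that is "useless" or "good enough" depends on the hash speed and on how many other schemas the attacker also has to cover, which is exactly what the two sides here disagree about.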
since testing for 3 or 4 random words is a lot cheaper than testing all combinations, the word test can be used first. it doesn't matter that the cracker knows which method you use, only that they make a good guess.
1: try lists of known stupid passwords
2: test for random words
3: brute force all character combinations
there should be a few more steps available, but just to give an idea.
if your password is from category 1, you'll fall first; only a sufficiently long password from category 3, or a longer version of category 2, will keep you safe.
one more thing: a password with 8 random characters and a password with 8 random words are equally easy to remember, because each is 8 items.
8 random words are longer to type, but much harder to crack than 8 random characters.
I really do hate that many websites/services put a max length on passwords that is really short, like 16. I remember even outlook.com restricting it to 16 chars, which is really bad for people who would use long sentence-like passwords.
If that password was hashed with a single pass of vanilla MD5, Jeremi Gosney's cluster of 8 Nvidia GTX 1080i GPUs [2] would be running at 307,200,000,000 hashes per second.
In order to exhaust half of the keyspace, so odds would be in the favor of the password cracker finding the original hash, they would need to search only 140,737,488,355,328 hashes.
At 307.2 gigahashes per second, this would take approximately 458 seconds, or just under 8 minutes using the Niceware list.
However, jumping to 4 random words grows that time by a factor of 65,536, which means reaching 50% exhaustion would take approximately 1 full year. Moving to 5 randomly generated Niceware words, and it's impractical to attempt cracking the MD5 hash.
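The arithmetic in this comment can be reproduced with a few lines of Python. This is only a sketch: the hash rate is the figure quoted above, the list size is the 65,536 implied by the growth factor, and the function name is made up:

```python
NICEWARE_WORDS = 65_536               # 2**16 words, so each word carries 16 bits
HASHES_PER_SECOND = 307_200_000_000   # rate quoted above for single-pass MD5

def seconds_to_half_keyspace(num_words: int, rate: float = HASHES_PER_SECOND) -> float:
    """Seconds needed to try half of all num_words-long passphrases."""
    keyspace = NICEWARE_WORDS ** num_words
    return (keyspace / 2) / rate

if __name__ == "__main__":
    for n in (3, 4, 5):
        secs = seconds_to_half_keyspace(n)
        print(f"{n} words: {secs:,.0f} s = {secs / 86_400:,.1f} days "
              f"= {secs / 31_557_600:,.2f} years")
```

Each extra word multiplies the time by 65,536, which is why the jump from 3 to 4 to 5 words matters far more than any per-character tweak.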
Cherry-picking 3 words is a little dishonest for the discussion surrounding password security. The right "best answer" for password generation is to use a password manager, no argument there. And I don't know of any password generators that generate passphrases by default, Niceware, Diceware, or otherwise.
But if a user wants a passphrase instead, I don't know of a security expert who would recommend 3 words.
I must have read online for 2 solid years about how much better/safer and secure Password Managers are before I finally switched to one.
After switching though, the CONVENIENCE of a password manager is the most undersold part of it.
Nothing seems to be perfect solution with security, but if you're reading this and haven't switched to a password manager for whatever reason, security benefits aside I would highly recommend finally doing it.
Absolutely. The only issue I have is when I'm logging into something I log into every day, but other than that it's great to log into the manager and go "ah, here's the password I set up a decade ago".
What I don't understand is the math. Author makes the claim that using 3 words changes the combinatorics from 62^12 (62 characters, in 12 positions) to 20,000^3 (20k words, in 3 positions) but a hashing algorithm doesn't work with words, it works with characters, so if the words are 4 characters each, you've still got 12 characters to fill. Since an attacker doesn't know that you've not used symbols or numbers, they can't reduce the problem space to 26^12. Right?
Have I missed something in the article that would make the connection?
EDIT: Yes, hashing works with bytes, so technically, it can be even stronger if we include charsets from other scripts in the problem space.
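One way to see where the two numbers come from: the hash does operate on bytes, but the attacker chooses which candidate strings to feed it, so the two figures count guesses under two different guessing strategies. A small Python sketch of the comparison, using only the 62^12 and 20,000^3 figures mentioned above:

```python
import math

def bits(candidates: int) -> float:
    """Size of a search space expressed in bits."""
    return math.log2(candidates)

char_space = 62 ** 12      # 12 positions, 62 possible characters each
word_space = 20_000 ** 3   # 3 positions, 20,000 possible words each

if __name__ == "__main__":
    print(f"62^12    = {char_space:.3e} ({bits(char_space):.1f} bits)")
    print(f"20000^3  = {word_space:.3e} ({bits(word_space):.1f} bits)")
    print(f"ratio    = {char_space / word_space:.3e}")
```

The smaller number only applies if the attacker actually tries a word-based attack; a password is as weak as the cheapest strategy that happens to cover it.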
Exactly. The attacker would have to try a word-based attack to benefit from the ~7 hour time-to-crack.
So I disagree with the article's advice: "Don't use words in passwords. Ever." Yes, you should use caution when using words in a password, but even if you use a password manager, a 5- or 6-word diceware password is ideal. Even better if you stick on a 4-digit numeric "salt" to your diceware passwords.
But yes, I do agree that a 3-word password is too short (~31 bits of entropy[1]). It should be at least 5 words (~52 bits). And you really need at least 6 words (~62 bits) for a master password.
1. Using EFF's user-friendly, ~1200 word list for diceware.
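The per-word entropy depends entirely on the size of the list, so it is worth checking for whichever list you use. A quick Python sketch; the three list sizes are assumptions picked for illustration (1,296 = 6^4 is what four dice rolls per word can index, 2,048 = 2^11, 7,776 = 6^5):

```python
import math

def bits_per_word(list_size: int) -> float:
    """Entropy contributed by one uniformly chosen word."""
    return math.log2(list_size)

def passphrase_entropy(list_size: int, num_words: int) -> float:
    """Total entropy of num_words independent uniform picks."""
    return num_words * bits_per_word(list_size)

if __name__ == "__main__":
    for size in (1_296, 2_048, 7_776):   # 6**4, 2**11, 6**5
        row = ", ".join(f"{n} words: {passphrase_entropy(size, n):.1f}"
                        for n in (3, 5, 6))
        print(f"{size:>5}-word list -> {row} bits")
```

A 1,296-word list gives about 10.3 bits per word, while exactly 11 bits per word needs a 2,048-word list.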
Imagine you got a database with 100k passwords to crack. You want to crack as many passwords as possible in the shortest time. You don't have time to go full brute force, and you don't even know the length of each password.
So first you try all the dumb single-word passwords from a dictionary (love, hate, fuck, password). If you have a cracking rig or some beefed-up server, you will go through 100k passwords and all single words, starting with the most commonly used ones, very quickly (sorry, but I leave the math to others). Even though the hash algorithm works on characters, you just hash the dictionary words and compare the results to each stored hash. You can go even further and keep a list of stupid words already hashed, so you only need to compare prepared hashes (google rainbow tables), a nice optimization. Then you can use an improved dictionary with l33t sp33k.
Let's say you get 10% of the 100k passwords this way.
Now you see that other people have stronger passwords, so instead of going random you generate combinations of 2 words and run them against the database, then move on to combinations of 3 words. This will, let's say, get you another 20% of the passwords.
There are many other optimizations you can come up with. The passwords you did not crack you just leave; you have no incentive to run a full brute force over 12-character combinations. You have to use passwords as soon as you have them and start credential stuffing everywhere to get as much as you can. This is not fun and games like cracking your schoolmate's password to post "I am stupid" on his wall.
I haven't had the time to read the full article, however I think I might have an answer to your question.
The entropy computes the strength of the algorithm used to generate passwords, not the strength of the password itself. So basically, you are getting the strength of the password if the attacker knows the algorithm.
Novice in the area, so grain of salt (and if you can correct me where I'm wrong, that'd be great. All for learning):
1) I don't think the base changes nor the exponent deviates from the character approach (in reference to the 20,000, 40,000, and 171,000 base stated). If we're in a system that allows all uppercase letters, all lower case letters, all special characters, and all numbers, then the base is the sum of those: regardless if my password is purplepenguinparade because you don't know that I've artificially set parameters for my password within the existing parameters. It could be argued you could try all permutations of lowercase characters first (base 26) but then how do you know to go to 26^19 before adding more to the base?
1a) if the hashed passwords give away tells as to what the decoded password is like (in this case 3 words lowercase), that'd seem like a concern about the encryption moreso than the password.
2) The 20,000 and 40,000 part seems like a meaningless piece of trivia. What if one of the words I want to use doesn't fall in your 20,000-40,000 word store? You're never going to crack my password. Better go the full 171,000 or whatever it was and hope I don't slip an extra character of the numeric, special character, upper/lower case variety.
-------------
While I do personally use a password manager, it isn't perfect for me either (requires loading 1Password 2x on Windows 10 for me because the first attempt does nothing, but loads on second try). I do hope more OS creators do like Apple did to better tie in with password managers, as that may help alleviate some of it. So I don't see my parents or grandparents, for example, using one.
That said, for the rank and file folks that are taking their security advice from the government and police departments, I think pushing for various words in sequence is much better advice than "8 characters in length because it takes longer to guess than 6 characters" or whatever it is that people predominantly operate under and usually ties back to something about themselves that someone with some familiarity about them could guess (if I recall, this is what happened to Sarah Palin when she was hacked shortly after being McCain's VP nominee).
To your first point, I think it's about priors about password choices. To take an extreme example, even if a site allows up to 32 alphanumeric characters, we don't just say passwords on this site are uniformly secure under metric (26+26+10)^32 because theoretically a brute-force approach would have to go through up to that many permutations. (Such a number is an upper bound on your password security.)
In particular, short passwords with up to 8 characters are weaker because they form a small and common subset of passwords that people often draw from, and attackers exploit that by trying short passwords first.
Passwords that are 3 dictionary words, with possibly some small perturbations like punctuation inserts or replacements (which helps but only expands the space so much), also form a relatively small subset of passwords that people increasingly draw upon, and attackers aware of this will tailor their brute-force search accordingly.
I might be misunderstanding your first paragraph, but I'm not saying that the exponent is constant, just the base.
What I was saying is if the rules of the site allow for all lower, all upper, special characters, and numerals, the sum of that is your base. If your password is purplepenguinparade, the expectation is you've cracked it by the time you've completed combined_base^1 through combined_base^19. They could artificially limit it (most people use lowercase characters and this site allows just lowercase characters, so let's try 26^n), but they'd run the risk of never cracking it because characters could've been added that deviate from their parameters.
Oops, that was my misunderstanding in the first paragraph :)
Maybe a clearer example w.r.t. the base^exponent value would be another highly structured class of long passwords, instead of short passwords. For example, those that are just a 2-character sequence repeated 20 times (e.g. abab...ab). You can still say that it'll take maybe up to 26^40 tries for an attacker to guess this, but that number should be more obviously too charitable, since less entropic classes of passwords are more likely to be guessed first.
More generally, this kind of counting analysis depends on your choice of attacker, and it's plausible for attackers to be more sophisticated than just trying all the strings in lexicographical order. A more concrete but still practical attack sketch adapted to the frequency of password schemes today could be: guess dictionary word combinations first, starting with common/short words/phrase lengths, then repeat with small perturbations, then more perturbations, and so on until you search all the remaining strings.
Then, the number of tries for such an attacker to find a 3-word passphrase with a few changes would be much closer to (some small constant) * (# dictionary words)^3 than (# characters)^(string length). Of course, you can still say that such an attacker would take at most (# characters)^(string length) tries to get it, but such an upper bound isn't as useful when the password is much more structured and easy to guess with a slightly more sophisticated attacker.
(Yet another way to put it: one shouldn't expect a password that's 'slightly out of reach' to be significantly more secure than a password that follows a scheme exactly - a more sophisticated attacker would test neighbouring passwords as it brute-forces the combinations in the scheme)
I wrote a perl script a while back to create random passphrases based on a random mnemonic word
It seems like a good idea to me but I’m not knowledgeable enough about cryptography to know if that’s really true and would be very interested to hear from anyone who does know
While I use strong passwords (better safe than sorry), how important is it really to worry about this kind of offline attack if you don't reuse passwords?
I believe it comes from a good place, but this post is bad.
Why?
Because it doesn't apply the same comparison standard to character-based passwords as to word-based passwords.
Sure, we only use a few common dictionary words. But by the same token, our character-based passwords aren't random strings either: they are variation based on random words.
Notice Randall gives each word 11 bits of entropy, meaning he considers each word is chosen at random amongst a pool of ~2000 words. Yet that base is still much bigger than ~64 for characters.
If you take 3 common words, you get ~8 billion possibilities. 4 common words: 17,000 billion. That's much more than ~400 billion for a perfectly random (again: that's not the case in practice) password of 12 characters. And four words is easy to remember, 12 random characters is HARD.
If anything bad can be said about the xkcd comic, it's that it doesn't hammer the point that it has to be FOUR words. As three is much less secure (but still probably more secure than the 12 characters password people are likely to use in practice).
This post also fails to take into account people's exposure to fiction. My vocabulary may be limited, but many of my passphrases use nouns/concepts/slang that are drawn from video games, novels, movies, etc. Some of these works (authored by family, friends, and myself) aren't even published. I reckon the chance of these "words" being included in some cracker's dictionary is quite close to zero.
Also as you point out their analysis of the entropy of random words is quite uncharitable. It fails to account for any sort of punctuation, capitalization, etc. Is my password "correct horse battery staple" or "correct Horse, battery staple!" or "CORRECT HORSE! Battery staple.", etc. There is a lot of entropy inherent in crafting a phrase.