Hacker News

I've been using phrases and sentences as passwords for a while, and I've found that there are two main problems:

1) A lot of sites, still in this day and age, have max password lengths, so I still have a lot of short passwords. Usually this is bank sites and the like.

2) Password entry fields are often very short visually, and with a long password getting lost is much easier. I find I have to type them over A LOT.

The second is actually the more annoying problem.



These are the real issues with this. Banks seem to be borderline idiots when it comes to password security: case-insensitive, no spaces, 20-character max, small choice of "special characters". These are from Amex, whose password requirements sadly were even worse a few months ago.

With crappy password requirements, it's impossible to use decent passphrases. Getting locked out of your account for 3 failed attempts at typing a 30-character password is pretty obnoxious, too.

In situations that allow passphrases, you don't need a password generator like this. You can grab a sentence from your favorite book and use it, e.g. "How do you do, Miss Doolittle?" That's not the best choice, but it's still got way more entropy than a standard password, probably a lot more than you'll get by choosing a 4-gram composed of words from a corpus of 2k, and it's easier to remember.


It turns out that you are mistaken.

Your favorite book is almost certainly chosen from the 129 million books that Google knows about: http://www.fastcompany.com/1678254/how-many-books-are-there-...

That gives you 27 bits of entropy.

The average book length is probably not over 400 pages. An average page probably doesn't have over 25 sentences on it. So the whole book contains only ten thousand sentences.

That gives you 14 more bits of entropy.

The total is 41 bits of entropy. This is one-eighth as secure as a 4-gram composed of random words from a corpus of 2k, if we measure strictly by entropy.
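The arithmetic above can be checked with a short Python sketch. It assumes, as the comment does, that the book and the sentence are each picked uniformly at random; the variable names are illustrative. Without rounding each term up, the quote comes to roughly 40 bits and the 4-gram is about 12 times stronger rather than 8, which is the same order of magnitude.

```python
import math

def bits(choices: float) -> float:
    """Entropy in bits of a uniform choice among `choices` options."""
    return math.log2(choices)

# Figures taken from the comment above.
BOOKS = 129_000_000             # books Google knows about
SENTENCES_PER_BOOK = 400 * 25   # 400 pages x 25 sentences per page
CORPUS_SIZE = 2_000             # word list for the 4-gram
WORDS = 4

book_bits = bits(BOOKS)
sentence_bits = bits(SENTENCES_PER_BOOK)
quote_bits = book_bits + sentence_bits
fourgram_bits = WORDS * bits(CORPUS_SIZE)

# How many times larger the 4-gram search space is than the quote space.
ratio = 2 ** (fourgram_bits - quote_bits)

print(f"choice of book:     {book_bits:.1f} bits")
print(f"choice of sentence: {sentence_bits:.1f} bits")
print(f"quoted sentence:    {quote_bits:.1f} bits")
print(f"random 4-gram:      {fourgram_bits:.1f} bits")
print(f"4-gram space is about {ratio:.0f}x larger")
```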

The situation is actually much worse, though: your favorite book is probably a popular book. So the number of bits of entropy provided by the choice of book might be a lot smaller than 27. I would guess that it's perhaps 10.

And many of those 129 million books are not very different. They contain quotes from other books, reprinted short stories, folk tales, set phrases, and so on.

In practice I think it might be difficult to mount a password-guessing attack using the Google Books corpus, because it's hard to get access to that corpus. The Project Gutenberg corpus would not be so hard.


Of course, the flip side of this is that we're veering off into attacks where you're targeting one specific person and know a bit about how they've chosen their password.

If you want to mount such an attack, fine, but most of us are dealing with the much-more-common threat of someone who gets a file or a database of hashed passwords and wants to crack them all in one go.


The analysis applies to that threat as well; it just adds some constant number of bits of entropy.


That's an interesting analysis. I can't really see any major deficiencies with it.

On the plus side, a sentence is probably going to be easier to remember than 4 random words. Personally, I draw some of my "high-security" passwords from literature, but then I modify the case and do the "leetspeak" character substitution, so a naive sentence attack would not work. A more clever one might, though.


Edit: Oops, as dpark points out, I swapped two digits. My apologies. Below, my original, erroneous comment.

41 bits of entropy means you have on the order of a one in 10^12 chance (2^41) of guessing it, and 2,000^4 is on the order of 10^16. So how is the former "one eighth as secure" as the other? Wouldn’t it be 10^4 times less secure, that is, 10,000 times more likely to be cracked?


2000^4 is on the order of 10^13. Specifically it's 1.6E13. Maybe you accidentally swapped the 1.6 and the 13?


The attack you describe is easy to defeat by making a small modification to the selected sentence.


If you choose one of eight small modifications to apply at a randomly-selected character, you get perhaps 6 bits of entropy from the choice of character and 3 bits from the choice of modification. That's better, but adding an extra common word to the end of the sentence would be better still.


Don't forget sites that require: "your password MUST contain at least one number, one uppercase letter, and one of the following characters: !, @, #, or $, but not %, ^, &, or *". I slap my forehead at how counterproductive these requirements are.


This is why, for my lab's password changer, the requirement for short passwords is simply that it must have one upper, one lower, one digit, and one none-of-the-above (and be at least 8 characters).

If you have a long password (at least 16 characters), all other requirements are waived so that you can use passphrases.
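A policy like this is small enough to sketch in a few lines of Python. This is only an illustration of the rules as described, not the lab's actual code; the function name and the default thresholds are assumptions taken from the numbers in the comment.

```python
def is_acceptable(password: str, min_length: int = 8, long_length: int = 16) -> bool:
    """Return True if the password satisfies the policy described above."""
    # Long passwords skip every other requirement, so passphrases work.
    if len(password) >= long_length:
        return True
    if len(password) < min_length:
        return False
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # "None of the above": anything that is not upper, lower, or a digit.
    has_other = any(
        not (c.isupper() or c.islower() or c.isdigit()) for c in password
    )
    return has_upper and has_lower and has_digit and has_other


for candidate in ("K!ybo4rd", "mykeyboardisblue", "Password1", "password"):
    print(candidate, is_acceptable(candidate))
```

The early return for long passwords is the design choice that matters here: it lets a plain lowercase passphrase through without any character-class check.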


Wow, sanity in password requirements? Do they also avoid the silly mandatory 30-day password change?


I hope so; that's annoying and counterproductive.


Forcing one or more digits has little value. You are better off requiring 1 uppercase, 1 lowercase, and 2 non-alphabetic characters. (Users are very likely to just replace a letter with 1 or 0, so 2 options * 8 positions = 16 possibilities = fail.)


Which is exactly the sort of terrible rule xkcd is criticizing (paraphrasing glenra).

Instead of 4 extra enforcements you could add 8 extra characters.

Your entropy is (somewhat simplified)

One 8 letter word: 15 bits

1 uppercase = 3 bits (or even just 1 bit, people capitalize the first letter)

reversing 2 rules above: 1 bit

replacing two characters at random places: log2(8*7/2) = 4.8 bits

inserting 2 random non-alphabet characters: log2(40^2) = 10.6 bits

Total: 34.4 bits

The entropy of three medium-difficulty words is log2(4000^3) = 35.9 bits

Instead of memorizing K!ybo4rd it could be mykeyboardisblue.
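For anyone who wants to rerun these numbers, here is a rough Python sketch. The 15-bit word estimate and the 3-bit and 1-bit rule estimates are copied from the comment as given; only the two log2 terms and the three-word figure are computed. The totals agree with the comment's figures up to rounding.

```python
import math

# Per-rule entropy estimates, in bits. The first three values are the
# comment's own estimates and are not derived here.
rule_bits = {
    "one 8-letter word": 15.0,
    "one uppercase letter": 3.0,
    "reversing the rules above": 1.0,
    # Choosing which 2 of the 8 positions to replace: C(8, 2) = 28 ways.
    "replacing two characters": math.log2(math.comb(8, 2)),
    # Two inserted symbols, each from an assumed set of 40 characters.
    "inserting two non-alphabet characters": math.log2(40 ** 2),
}

total_rule_bits = sum(rule_bits.values())

# Three words drawn uniformly from an assumed list of 4000 words.
three_word_bits = math.log2(4000 ** 3)

for rule, value in rule_bits.items():
    print(f"{rule}: {value:.1f} bits")
print(f"total with rules: {total_rule_bits:.1f} bits")
print(f"three plain words: {three_word_bits:.1f} bits")
```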


The requirement for many of my websites is simply that it "must not consist solely of lowercase letters" (as well as a minimum length).


>must not consist solely of lowercase letters

Which is exactly the sort of terrible restriction xkcd is criticizing.


A space is not a lowercase letter, so the xkcd password would pass my test.


Then the space would be "the obeisance to the stupid website piece". Note that the entropy of "correct horse battery staple" is only one bit more than "correcthorsebatterystaple".


Yeah, a lot of my passwords look like "securesecretpassphraseA1!"

There's the secure piece, and there's the obeisance to the stupid website piece.


I have a couple of domains registered with 123-reg.

To prevent unauthorised access to your account your password must contain 8 characters.

Wait, what? They're right, too. You can't have 7 characters and you can't have 9.


Yes, a friend was complaining about that recently.

It's a bruteforcer's dream.


Hi

I work on behalf of 123-reg.

We are working on changing this in future control panel updates.

Regards,

Ricky


What could the reasoning behind those requirements possibly be?


Usually the symbols involved are used by SQL or some other layer, and the programmers insert the password directly into the query string because they don't know any better. This leads to SQL injection and other issues.

So rather than discovering the correct way to do things, they try to prevent you from using any characters that might be involved in an SQL injection.

In some cases the guys on the backend know what they're doing, but the requirement can still be passed down from on high from some manager who absorbed the practice from another project.


If anyone knew what they were doing, the unencrypted password would be nowhere near a SQL statement.


They're trying to force users to use those characters in an attempt to enlarge the space passwords are drawn from. It doesn't work very well, of course. Instead of "password", you just get "Password1!". That said, I might make the same choice (for short passwords) if I were implementing password policy.

Edit: If you meant the "but not %, ^, &, or *" requirement, that's an indication that the devs don't know how to use prepared statements or at least escape properly.
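To make the prepared-statement point concrete, here is a minimal Python sketch using the standard library's sqlite3 and hashlib modules. The table layout, function names, and iteration count are assumptions for illustration, not anyone's production code. The password is salted and hashed before it goes near the database, and every value is passed as a bound parameter, so characters such as %, ^, &, * or quotes need no special handling.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 100_000  # illustrative work factor, not a recommendation

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash; the plaintext never reaches the database."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

# Hypothetical schema for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")

def set_password(name: str, password: str) -> None:
    salt = os.urandom(16)
    # The ? placeholders bind values separately from the SQL text.
    conn.execute(
        "INSERT OR REPLACE INTO users (name, salt, pw_hash) VALUES (?, ?, ?)",
        (name, salt, hash_password(password, salt)),
    )

def check_password(name: str, password: str) -> bool:
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(stored, hash_password(password, salt))

# A password full of "dangerous" characters is stored without trouble.
set_password("alice", "Robert'); DROP TABLE users;-- %^&*")
print(check_password("alice", "Robert'); DROP TABLE users;-- %^&*"))
```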


Those requirements are there for the people who try putting just their name or "password" or their 4 digit ATM PIN as their password. For very short passwords, only having alphabetical (not even alphanumeric) passwords is terrible. Those requirements are there to prevent some really stupid passwords.


Covering your ass by disallowing passwords like "password".


No, I meant specifically why they would allow certain special characters and not others.


Also annoying is that a lot of sites require gibberish. Apple requires at least one uppercase, one lowercase, and one number. Some sites require a symbol as well.


Especially if you are logging into multiple systems regularly using domain credentials, it rapidly becomes apparent that the faster and easier the password is to type, the better. I've found that some passwords with symbols and numbers just roll off the fingertips with a little practice, others not so much, but longer passphrases are for some reason the worst.


This. My password is not a word, not even a word with substitutions, but it is optimized towards typing it on a keyboard (in terms of when caps come in, when numbers are added, switching hands, etc). I can knock it out in a second and it's muscle memory with zero risk of forgetting. correct horse battery staple, not so much. I lose some entropy by making it typing-friendly, but the cracking algorithm to simulate that would be pretty difficult. I'll take the loss.

As an aside, 1000 guesses a second? Seems generous.


Very few sites have a short max password length. I use 1Password, and of the 63 sites for which I've stored passwords, all but 2 allow 25-character passwords. Ironically, my bank only allows me 15 characters.

I haven't typed a password in 3+ months - don't know what any of mine are anymore, so I find typing is no longer an issue.


I really like using 1Password. I have a long, easy-to-remember passphrase as my unlock key, then use randomly generated passwords that are as long as is feasible for each service.

1) Nothing written down

2) Unique per service

3) Adjustable difficulty & char set per service, to match their stupid requirements.

Seems like the best of several worlds.
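The per-service generation step can be approximated with Python's secrets module. This is a sketch of the general idea only and makes no claim about how 1Password works internally; the service names and their limits below are made up for illustration.

```python
import secrets
import string

DEFAULT_ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 25, alphabet: str = DEFAULT_ALPHABET) -> str:
    """Pick each character independently with a cryptographically secure RNG."""
    if length <= 0 or not alphabet:
        raise ValueError("length must be positive and alphabet non-empty")
    return "".join(secrets.choice(alphabet) for _ in range(length))

# Hypothetical per-service limits: (maximum length, allowed characters).
service_rules = {
    "example-bank": (15, string.ascii_letters + string.digits),
    "example-forum": (25, DEFAULT_ALPHABET),
}

for service, (max_length, allowed) in service_rules.items():
    print(service, generate_password(max_length, allowed))
```

Each service gets its own independent password, so a leak at one site reveals nothing about the others.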


I just use a password manager like LastPass or 1Password. If it has submission automation, you don't even need to type your password.

Just choose a nice strong master password.



