That may have been the case in 2011 when that article was written, but I have just tested this now and my facebook does not accept my password with the case reversed by having the caps lock key on.
IIRC, on Mac caps lock works like (caps OR shift) and on Windows it's (caps XOR shift). That is, on Mac, with caps lock on it's always caps, but on Windows it's just a toggle.
Is this implemented by Facebook holding 3 hashes of your password? It doesn’t save your actual password clear text (or encrypted clear text), does it?
A related question: when a password system tells me I need to change my password, and it has to differ by 3 letters from my previous password, is that system storing my password text rather than the hash of the password? Is that safe?
They wouldn't have to store 3 hashes, would they? They could just get the hash of each of those transformations, e.g., reverse case, get hash. If the transformation makes the incorrect password into the correct one, it will match the original hash.
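A minimal sketch of that idea, assuming one salted PBKDF2 hash is stored per user. The function names, the iteration count, and the exact set of variants are assumptions for illustration, not Facebook's actual implementation:

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def candidate_passwords(typed: str) -> list[str]:
    """The typed text plus the transformations that undo common keyboard mishaps."""
    candidates = [typed, typed.swapcase()]  # caps lock was on
    if typed[:1].isupper():  # a mobile keyboard capitalized the first letter
        candidates.append(typed[:1].lower() + typed[1:])
    return list(dict.fromkeys(candidates))  # de-duplicate, keep order


def verify(typed: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash each candidate with the user's salt and compare to the single stored hash."""
    return any(
        hmac.compare_digest(hash_password(candidate, salt), stored_hash)
        for candidate in candidate_passwords(typed)
    )


salt = os.urandom(16)
stored = hash_password("a&Ope$G", salt)
print(verify("A&oPE$g", salt, stored))  # True: caps-locked spelling
print(verify("a&ope$g", salt, stored))  # False: a genuinely different password
```

Only one hash is stored; the extra cost is up to two more hash computations on a failed first comparison.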
You can also normalize the password, e.g. always make the first letter lowercase and reverse the case of the rest if the second letter is uppercase. Then you only have to hash that.
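A rough sketch of that normalization, assuming "second letter" means the first cased character after position 0 (so symbols and digits are skipped). The function name is made up for the example:

```python
def normalize(password: str) -> str:
    """Map the original, caps-locked and first-letter-capitalized spellings to one string."""
    if not password:
        return password
    first, rest = password[0].lower(), password[1:]
    # First cased character after position 0, if there is one.
    second_letter = next((ch for ch in rest if ch.isupper() or ch.islower()), None)
    if second_letter is not None and second_letter.isupper():
        rest = rest.swapcase()
    return first + rest


print(normalize("Password"))  # password
print(normalize("PASSWORD"))  # password
print(normalize("a&Ope$G") == normalize("A&oPE$g"))  # True
```

Note that this accepts one spelling more than the three discussed here: "pASSWORD" also normalizes to "password". It also assumes ASCII-style case mapping.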
I hope Facebook passwords are limited to US ASCII, because I seem to remember that there are country-specific conversion rules for various Unicode characters that may or may not be subject to change, not to mention the lack of 1:1 mapping for case conversions. For example, the German lower case ß converts to SS, and so does ss. Of course they also created an upper case variant ẞ of ß a few years ago, so who knows what mapping any software will use.
Going further to avoid collisions that could happen between words like massen and maßen when upper cased the rule of thumb was to convert ß to SZ when that happened, so getting the correct upper case would have also required a full German dictionary.
TL;DR: Upper/Lower case conversion is complex, avoid it if possible.
Small addition since I completely forgot it: Since here the attempt is to fix caps lock issues the upper case of ß also depends on the users keyboard layout, for my keyboard ß would map to ?. Does anyone even have a list of all the keys that map to ? in different keyboard layouts?
Correct me if I am wrong but this doesn't seem to allow case conversion:
> Case Mapping Rule: There is no case mapping rule (because mapping uppercase and titlecase code points to their lowercase equivalents would lead to false accepts and thus to reduced security).
That only works if they have the same insensitivities across time and platforms. In this case, they still need to preserve the original information, because they're case sensitive on that first transformation for non-mobile devices.
No, it's only one bit for the first character and one for the second. The case of every other character is maintained relative to the second character, so the parity there provides the one bit of information for each subsequent alphabetic character.
why would you have to retrieve multiple? could you not calculate the 3 hashes, and then do SELECT WHERE pass = HASH1 OR pass = HASH2 OR pass = HASH3? You don't care which one was correct just that one is.
By retrieving three you can perform an if on one value; if that fails check the other two values. This allows you to save the other two calls for most logins.
You can push the comparison into sql but you still have a series of retrievals and comparisons. Just because it happens in sql doesn’t mean the processes don’t have to happen. In your example you have calculated three hashes, then pushed them to the sql server where the retrieval and comparison occurs.
Passwords are only verified on login. Does it seem reasonable that there are millions of logins to Gmail from mobile devices every second?
Back of the envelope: 2 million logins per second would mean about 170 billion logins per day. With 7 billion people on the planet, that'd mean about 25 logins per day from each man, woman and child.
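The same arithmetic spelled out, using only the figures from the comment above (2 million logins per second, 7 billion people):

```python
LOGINS_PER_SECOND = 2_000_000
SECONDS_PER_DAY = 24 * 60 * 60
WORLD_POPULATION = 7_000_000_000

logins_per_day = LOGINS_PER_SECOND * SECONDS_PER_DAY
logins_per_person = logins_per_day / WORLD_POPULATION

print(f"{logins_per_day:,} logins per day")  # 172,800,000,000 logins per day
print(f"{logins_per_person:.1f} per person per day")  # 24.7 per person per day
```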
Instead of just telling the parent that they're doing something wrong in a condescending way, can you explain what it is they should do differently? At least a link to an article that explains this?
Good lord, why would you ever expect pseudo code to be my level of understanding of how to store a password? I don't ever store passwords. Hashes only.
Because the pseudo code looks quite bad? The clause should pick the user not the password or hash or anything like it. The hash (and possibly salt etc) should come back via the selected column list. The other way round is inviting trouble and could indicate a poor understanding, though I agree they shouldn't be so snarky without some explanation.
Unless it has been edited since I saw it, the pseudo SQL doesn’t select anything; a logical assumption is that it's the user identity, which isn't needed. The comparison is between the original hashed password and the hashes made at auth-time. The name of the original is “pass” but since it wouldn’t make sense to compare a plaintext string to a hash another logical assumption is that “pass” is a hash.
Maybe these generous assumptions about someone’s pseudo code are unwarranted?
Jeebus, it's just meant to show that you could do a select in one go without having to do them one at a time cascading to the next one if no match. I don't know what you need to select, that's up to the reader. That's the point of pseudo code. You saw select and made the connection to "it's a database query". Boom. Point made. Again, I understand the concept of user provided pass and a hash with a salt. If you can't really put 2+2 together to see that you're taking the 3 different options then I'm sorry for you.
Look, whatever it’s meant to show, it also looks like a security nightmare that implies a lack of understanding. Are we supposed to ignore that because it’s pseudo code? Clearly some of us think not.
It’s not the select that’s the problem, it’s the clause, so your explanation also seems to imply that misunderstanding on your part is real.
You have to calculate at least one hash and you only have to perform one query (as it can grab multiple hashes at once), and that query should not choose what to return based on the content of the hash(es).
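To make that concrete, here is a small sketch using an in-memory SQLite database. The table layout, column names and the set of variants are assumptions for the example. The point is only the shape of the query: the WHERE clause picks the user, and the salt and hash come back in the column list.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def check_login(db: sqlite3.Connection, username: str, typed: str) -> bool:
    # Select by user, never by hash: one query, one row.
    row = db.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    # The comparisons happen in the application, against that one row.
    return any(
        hmac.compare_digest(hash_password(candidate, salt), stored_hash)
        for candidate in (typed, typed.swapcase())
    )


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, password_hash BLOB)")
salt = os.urandom(16)
db.execute(
    "INSERT INTO users VALUES (?, ?, ?)",
    ("alice", salt, hash_password("Password123", salt)),
)

print(check_login(db, "alice", "pASSWORD123"))  # True
print(check_login(db, "alice", "password123"))  # False
```

A stricter design keeps the hash inside the database entirely (as the Rodauth talk mentioned in this thread describes); this sketch only covers the "choose by user, not by hash" part.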
Pseudo code is supposed to strip away details that might distract from fundamentals, yet your pseudo code and subsequent replies suggest that your understanding is contrary to the actual fundamentals of checking a password securely. Start with limiting the set by choosing by user, never by hash.
Jeremy Evans goes over many of the fundamentals[1], including why restriction of the selection is important, and why restriction of access to hashes (i.e. not sending them from the initial machine) is important. In his own framework (Rodauth) he doesn’t even allow selects of the hashes to be returned to the app, let alone used as part of the where clause. Note the clause in each of the functions he defines (12:53 and 14:05).
> The comparison is between the original hashed password and the hashes made at auth-time.
Didn't I write that this shouldn't be done via the clause? I haven't edited my comment either so it should still be there and I see it is.
> the pseudo sql doesn’t select anything
It should select the hash(es) and bring them back to the app for comparison.
> The name of the original is “pass” but since it wouldn’t make sense to compare a plaintext string to a hash another logical assumption is that “pass” is a hash.
It doesn't matter whether it's a password or a hash, the form of the SQL statement is going to cause trouble and should be the other way round.
> Maybe these generous assumptions about someone’s pseudo code are unwarranted?
You can't compare hashes like that unless they're not salted.
The same password won't hash to the same thing without the same salt so you can't compare them like that.
(If you could, then you would notice multiple users with the same hashes, i.e. the same passwords).
To verify a hash you need to retrieve the user's salt (typically stored with the hash and the algorithm identifier in a single string), then re-hash with the same salt.
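A small sketch of what that looks like, assuming PBKDF2 and a made-up `algorithm$iterations$salt$hash` record format (real libraries use their own formats):

```python
import hashlib
import hmac
import os


def make_record(password: str, iterations: int = 100_000) -> str:
    """Return 'algorithm$iterations$salt_hex$hash_hex' (format invented for this sketch)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify(password: str, record: str) -> bool:
    """Re-hash the attempt with the salt stored in the record, then compare."""
    algorithm, iterations, salt_hex, hash_hex = record.split("$")
    if algorithm != "pbkdf2_sha256":
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), hash_hex)


first = make_record("hunter2")
second = make_record("hunter2")
print(first == second)  # False: same password, different salts
print(verify("hunter2", first), verify("hunter2", second))  # True True
```

Because the two records differ, a query of the form `WHERE pass = HASH1` cannot find the user without first knowing their salt.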
More importantly if the server just accepts hashed passwords and stores them, then if you got ahold of a hashed password through a leak you could just use it directly to authenticate by modifying the client. The hashed password just becomes the password with one extra client-side step that you can trivially skip.
Salting is more about making it non-obvious which passwords map to which hashes so you can’t easily build tables of hashes for common passwords.
Sending the password to the server in “plain text” is fine over https, it’s a secure channel. Hashing isn’t meant to hide the password on the wire, it’s to prevent anyone with access to the database from learning what the passwords are.
The server could hash again the hashed password sent by the client. Especially if the client uses an insecure hash algorithm (no secret salt for example).
I feel like if the client always hashes passwords as soon as they are typed (the javascript never sees the unhashed password), no one would notice. (except some with crazy password rules that would disallow a hash-looking password)
The various ZKP approaches are considerably more complex to implement properly vs the trivial approach of a client side hash. There are obvious tradeoffs, of course, but I wouldn't fault someone for an additional hash step on the client.
Hashing on the client still seems redundant though. In the end, whatever value is sent to the server is essentially plaintext, because it's all an attacker needs to know to authenticate. Whether it's the raw text the user typed or some transformed version of it isn't really relevant.
In a world where password reuse is rampant, whether it's the raw text the user typed or a hard-to-reverse transformation on it is absolutely relevant to the user, just not to the service provider.
Double-hashing (peppered on client, salted on server) does have a modest benefit: the passwords are no longer sent in plain-text and cannot be cheaply intercepted by a passive eavesdropper (i.e. without observably tinkering with the data sent).
This often isn't considered worth the accessibility and maintenance costs of requiring the user to compute a hash (the threat model isn't exactly hugely concerning, especially to service providers, and is mostly obviated by transport encryption anyway) or the risk that somebody's going to come along and ask why we're hashing twice and rip out the server hash (very bad), but calling that "no benefits" is more or less lies-to-children.
When you change your password, you're usually required to enter both the old and the new one. This is when the check is usually performed.
What I'm more worried about is the system that some Polish banks use, called masked passwords over here. With this system, you're only required to enter certain characters of your password, but the set of required characters changes at each login. This exists to make key loggers much less effective. There's apparently some hashing going on (something to do with curves and polynomials), but I couldn't find more details when I last looked.
Hopefully the bank stores a separate hash for each mask, generated at the time of password creation. Otherwise, it’s hard for me to imagine how this would be possible without saving the password in clear text.
If someone steals a hash for characters 1-4 they'll be able to brute force it. Only 10000x the cost of a single login. And then if you have the hash for characters 2-5...
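To illustrate the 10000x figure: a sketch assuming an all-digit password and one salted SHA-256 hash per four-character mask. Both are assumptions for the example; how the banks actually store this is not known here.

```python
import hashlib
from itertools import product
from string import digits


def mask_hash(chars: str, salt: bytes) -> str:
    """Hypothetical per-mask hash: salted SHA-256 over the selected characters."""
    return hashlib.sha256(salt + chars.encode("ascii")).hexdigest()


def brute_force(target: str, salt: bytes, length: int = 4, alphabet: str = digits):
    """Try every combination; returns the match and how many guesses it took."""
    for attempt_number, combo in enumerate(product(alphabet, repeat=length), start=1):
        guess = "".join(combo)
        if mask_hash(guess, salt) == target:
            return guess, attempt_number
    return None, len(alphabet) ** length


password = "8305172946"
salt = b"per-user-salt"
stolen = mask_hash(password[0:4], salt)  # hash covering characters 1-4
recovered, tries = brute_force(stolen, salt)
print(recovered, tries)  # 8305 8306
```

With a larger alphabet the search space grows, but four characters at a time stays far smaller than the full password.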
Is this still the case? When living in Poland 15 years ago I had an account with WBK and they had this idiotic system where I had to write my password on a piece of paper with numbers below to be able to give the 3rd, 5th and 11th character. Goodbye password managers (at least the automatization part).
Then, suddenly, they got back to a normal login and password (I think I had the choice IIRC) but then I left the country.
Poland is a beautiful country, I lived in Krakow for a few years and it was A-WE-SOME.
Note that the client would only need to do this on a failed attempt.
So if I typed "Password" on mobile, the client would first send the request as "Password". If that succeeds, then no worries. If it fails, then the client could send a second request by reversing the case of the first letter. In this case, it would send a second request for "password".
At most, it is 2 login requests per password. Many other commenters here are incorrectly stating that 3 requests would be necessary, but this is untrue. A letter can only have 2 possible cases (uppercase or lowercase). So the client sends the originally typed one, and if that fails, then it flips the case of that first letter. That is the only alternative. There is not a third option.
A well-built login form would restrict users after 3-5 login attempts anyway and require a password-reset process. So that is 6-10 client requests to the backend (n * 2). That shouldn't be hitting any sort of rate limit.
It would only halve the rate limit, but any real brute force attempt requires way way more than what a normal human would try. Something like 5 attempts would double to 10 in the backend, still nowhere near enough to brute-force, but enough for human trial and error.
To your related question, on the rare systems where I’ve had to do that, I’ve always (to the limit of my memory) had to submit “old password” “new password” “new password again”, so the old password would be available in plaintext client side (and still able to be verified server side after hashing).
I’m sure it’s client side and it just retries the different ways if it encounters a password error. Much easier and more secure than 3 versions of the password.
Yeah, I noticed this since I have two accounts with emails that are one character apart, and I was unable to log into one of my accounts because of this feature…
They might do the same stupid thing Gmail does, and ignore certain characters. My Gmail is "first.m.last@gmail.com", but I constantly get mail from idiots who don't know their own email address, and use my "firstmlast@gmail.com" to sign up for things. This problem would go away entirely if Gmail didn't do this. Facebook might do similar things to make it "easier" to login, even though there are security implications.
That’s by design though, the .‘s are optional. You can add more even. Also plus routing: first.m.last+whatever@gmail.com also routes to the same email.
“Gmail doesn't recognize periods as characters in addresses -- we just ignore them. For example, you could tell people your address was hikingfan@gmail.com, hiking.fan@gmail.com or hi.kin.g.fan@gmail.com. (We understand that there has been some confusion about this in the past, but to settle it once and for all, you can indeed receive mail at all the variations with dots.)”
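For anyone de-duplicating sign-ups, the behaviour described above can be sketched like this. The function name is made up, and lowercasing the local part is an extra assumption beyond what the quote states:

```python
def canonical_gmail(address: str) -> str:
    """Collapse the Gmail spellings described above into one canonical form."""
    local, _, domain = address.rpartition("@")
    if domain.lower() != "gmail.com":
        return address  # the dot rule is specific to Gmail
    local = local.split("+", 1)[0]  # drop "+whatever" routing tags
    local = local.replace(".", "")  # dots are ignored
    return f"{local.lower()}@gmail.com"  # assumption: case is ignored too


print(canonical_gmail("hi.kin.g.fan@gmail.com"))  # hikingfan@gmail.com
print(canonical_gmail("first.m.last+whatever@gmail.com"))  # firstmlast@gmail.com
```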
> This problem would go away entirely if Gmail didn't do this
No it wouldn't. The problem is that people believe they have addresses they don't. They don't have firstmlast@gmail.com any more than they have firt.m.last@gmail.com.
I have a surname@ address, and I receive similar mails all the time. People just simply assume they have my email address. No dots involved.
It would go away for me because no one ever uses the real first.m.last address; all the misdirected mail is from people who think firstmlast is their email address.
It seems to me that the third case could easily be subsumed by first transforming the text by inserting a "change case" token in front of any character with a different case than the previous character. In such a representation the original text and the "caps-locked" text are indistinguishable.
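A sketch of that representation, assuming a NUL byte as the "change case" token (any marker that cannot occur in a password would do) and that non-letters carry no case:

```python
CASE_CHANGE = "\x00"  # assumed marker for "case differs from the previous letter"


def encode_case_changes(text: str) -> str:
    """Lowercase every letter, marking only the points where the case flips."""
    out = []
    previous_upper = None  # case of the most recent cased character
    for ch in text:
        if ch.isupper() or ch.islower():
            if previous_upper is not None and ch.isupper() != previous_upper:
                out.append(CASE_CHANGE)
            previous_upper = ch.isupper()
            out.append(ch.lower())
        else:
            out.append(ch)  # digits and symbols pass through unchanged
    return "".join(out)


print(encode_case_changes("a&Ope$G") == encode_case_changes("A&oPE$g"))  # True
print(encode_case_changes("Password") == encode_case_changes("password"))  # False
```

The caps-locked spelling collapses onto the original, but the auto-capitalized first letter still needs its own check.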
This is definitely a feature and not a bug. I was a little worried when we implemented this 5 years ago, but it turns out there's no real security risk here. My app was a financial app and so many people had trouble logging in on mobile that we basically had to implement this.
Many people note their passwords down in eg. a text document. Not a great practice, but password management is a pain for most people. So when they do that, their editor might auto-capitalize the first character.
No, even if you fix it... it will go wrong the moment you press enter to save.
Some mobile keyboards are just dumb enough to fight against users even when they explicitly change it back. Some will have an option to disable it, but some probably won't.
No. What happens is people fail to login several times in a row and complain about it. This approach changes the security of a password basically not at all while reducing the number of people who become aggravated by not being able to login because they don't realize their capslock is on.
Most apps use the appropriate type of input field ("password") which the browser or mobile OS recognizes should not be capitalized. But some apps use a normal text input (with some masking/styling to make it look like a password field) which a mobile keyboard will normally capitalize. I get very annoyed when I see the latter.
As someone else mentioned, it's dependent on your choice of keyboard app on Android. The problem of having to trust another vendor is probably why my financial institution provides its own in-app keyboard for its apps.
With over 1.5 billion active users... 6.5% is still a very large number of people!
Also.. not sure what safari/iOS did in their early years with keyboard password entry capitalization... but if they did auto capitalize... since Apple is so good at saving profile info across new installs/os updates... I imagine there would be a large portion of old apple users with perma-capitalized passwords out there as well.
iOS has never auto-capitalized type=“password”. It’s absolutely bonkers to me to imagine any mobile browser ever doing that, I was assuming some websites were reimplementing password fields badly with type=“text”.
It’s both a feature and a bug (well, a compromise with associated security risk). The feature is as you describe (and probably mitigates a lot of unnecessary account locks). The bug/compromise is that intentional or automated casing to achieve password complexity is a little bit less effective. I think the benefits of the feature outweigh the security risk, but it should also be widely disseminated, and anything dealing with passwords should take this into account.
if your pw was "abcd", then "Abcd" should work, right? Because it auto-capitalizes the first character of the first word. If your pw was "ABCD", then "abcd" (all-small) should be accepted because it auto-capitalizes and then you press the capital key once for caps, so it becomes all small.
I can only speak to iOS here, but if you focus any field with auto-capitalization and immediately touch “shift” it just switches to lowercase, not caps lock. The only way to engage cruise control[1] is with (timing-configurable) double-tap.
Some other sub-comment mentioned that Facebook used to account for the first letter being capitalized, and for reversed case (I think they no longer do that, according to another comment). For example a&Ope$G would be accepted in its original form, as well as A&Ope$G and A&oPE$g
A user was having a really bizarre problem: They could log in when they were sitting down in a seat in front of the keyboard, but when they were standing in front of the keyboard, their password didn't work! The problem happened every time, so they called for support, who finally figured it out after watching them demonstrate the problem many times:
It turned out that some joker had rearranged the numbers keys on the keyboard, so they were ordered "0123456789" instead of "1234567890". And the user's password had a digit in it. When the user was sitting down comfortably in front of the keyboard, they looked at the screen while they touch-typed their password, and were able to log in. But when they were standing in front of the computer, they looked at the keyboard and pressed the numbers they saw, which were wrong!
I just want a phone number input box that will strip dashes for me.
Many go to the effort of having an error message pop up that says "no dashes or parentheses allowed." So they went to the effort of writing special case code to notice and handle this ... by giving instructions to the person, instead of the computer.
This is like when on a cli application -h displays a hint that you probably meant --help (or the other way around). If you already know someone wants to display the help, why not just display it?
This is different, there is no special case handling here for you typing "exit".
Python functions are invoked with parentheses, while typing a name without parentheses retrieves the content of a variable. The Python CLI helpfully sets the "exit" variable to that string so that you don't get a confusing NameError when you make this mistake.
If the `exit` variable were set to a string, `exit()` would be an error because you can't use `()` on strings. `exit` is a callable whose `__repr__` method returns that message.
Note that this isn't specific to the REPL. Running `print(exit)` in a Python script will print the same message.
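A simplified stand-in for that object, to show the mechanism. In CPython the message comes from `__repr__`, which `print()` and `str()` fall back to; the class below is a sketch, not the real implementation from the `site` module:

```python
class Quitter:
    """Simplified stand-in for the object installed as `exit` and `quit`."""

    def __init__(self, name: str, eof: str) -> None:
        self.name = name
        self.eof = eof

    def __repr__(self) -> str:
        # The REPL echoes repr(); print() and str() fall back to it as well.
        return f"Use {self.name}() or {self.eof} to exit"

    def __call__(self, code=None):
        raise SystemExit(code)


my_exit = Quitter("exit", "Ctrl-D (i.e. EOF)")
print(my_exit)  # Use exit() or Ctrl-D (i.e. EOF) to exit
```

Calling `my_exit()` raises `SystemExit`, which is why `exit()` works while a bare `exit` only shows the hint.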
I'd say that would be much more surprising and unintuitive behavior just for the sake of slightly more convenient REPL use. I wouldn't want stringifying any function to automatically call it. What if you store the function somewhere and print it for debugging, and then have to figure out why your program keeps crashing when you try to just print a list of functions?
Besides, you usually have a more convenient exit available with Ctrl-D anyway.
Yeah but I can also say it’s a surprise to see instructions on how to exit the REPL when printing a list of functions. Honestly I think there should be no special case, and the REPL should print the “how to exit” text upon startup. As is, the user still has to guess ‘exit’ rather than ‘quit’ ‘abort’ ‘stop’ ‘bye’ etcetera to get the help text. (edit: actually they have the text on 'quit' as well)
I'd love to find (never looked...) a python3 repl where `print`, `dir`, `help` all behave like python2's `print`, since they're debug/lookup tools. It's rather often I'll open a terminal and want to check one of those things, and... typing () characters just adds significant effort (for lack of better description).
$ bc -l
bc 1.07.1
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
exit
0
quit
Using Ctrl + D to exit a shell and Ctrl + C to interrupt/break off the current line is a rather strong convention. Bash does this too, and many other REPL's and shells (Ruby's irb for one).
A lot of technical folk who rely on this immediately notice when some shell doesn't do this (exiting instead of breaking off the current input line). I have this with HBase's hbase shell. I've dropped out of that by accident dozens of times because Ctrl + C interrupts the whole shell.
FWIW, Ctrl+D is not "magic". It works because Ctrl+D is EOF. Applications that read input should expect that the input is done when they encounter EOF.
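In other words, a program needs no special Ctrl+D handling; it just stops when reading reports end of input. A minimal sketch (the function name is made up, and `io.StringIO` stands in for a terminal):

```python
import io
import sys


def read_commands(stream=sys.stdin):
    """Collect lines until the stream reports EOF."""
    commands = []
    while True:
        line = stream.readline()
        if line == "":  # readline() returns "" only at end of input
            break
        commands.append(line.rstrip("\n"))
    return commands


print(read_commands(io.StringIO("1 + 1\nquit\n")))  # ['1 + 1', 'quit']
```

On a Unix-like terminal, pressing Ctrl+D on an empty line makes the read return nothing, which is the same condition the loop checks for.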
ctrl + c is accepted, it means "interrupt the current operation", like it does if you are in bash.
If there's no current operation, Python has no way of knowing for sure if you intended to exit, or if you intended to interrupt an operation, but the operation finished before you pressed the key combo.
Python's behavior here is good UX. A key combo that does two different things depending on the current state of the program, and the program's state is changing right in front of you... that would suck.
I like this feature because I'll use ctrl+C to cancel the current line I'm writing like I would in a terminal. Psql does the same thing, but I started using mysql recently and I keep getting booted out of the app because I want to cancel the current line ):
Last time I complained about something like that https://news.ycombinator.com/item?id=27951099 (it's okay to quote myself, right ? I am allowed to ?) I was told it's a UX FEATURE and apparently some people like to be treated like that when interacting with computers. ¯\_(ツ)_/¯
While I agree with that usage in the example you mentioned in your post over there, I don't think it applies here.
Some applications use -h and some --help. Many support both. So unless we get all software to agree on one standard here, expecting your user to remember which one it was is actually bad UX in my book. Best is to just support both, so people don't have to think about how to get the help.
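For what it's worth, some argument parsers already give you both spellings for free. Python's `argparse` registers `-h` and `--help` by default; the tool name and the `--verbose` option below are made up for the example:

```python
import argparse

parser = argparse.ArgumentParser(prog="mytool", description="Hypothetical example tool.")
parser.add_argument("--verbose", action="store_true", help="print more detail")


def shows_help(flag: str) -> bool:
    """True if the flag prints the help text and exits with status 0."""
    try:
        parser.parse_args([flag])
    except SystemExit as exc:
        return exc.code == 0
    return False


print(shows_help("-h"), shows_help("--help"))  # help text twice, then: True True
```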
Displaying a notice when some option changed makes total sense, though, but -h/--help is not that kind of issue.
But the changing isn't user visible. To the user, the phone number is the string "(416) 555-1270". If you want to store it differently in your database go ahead. But to the user, the phone number has dashes.
In fact, on my phone, when I type in just the digits, my phone inserts the parens and dashes. Presumably, users consider this easier to read, and dare I say, more canonical.
So, many applications can't handle phone number input in the exact form they display it to the user.
I was convinced that this already existed in the form of masked inputs, but after trying I realised those are broken in mobile Chromium and utterly unusable in mobile Firefox. As far as native components go, I think most platforms actually have a prebuilt control for this that works pretty well in my experience.
You'll probably need to do more than just strip dashes, though. You also need to strip spaces and parentheses but keep other characters such as + and ~. Many prebuilt phone number input boxes have trouble with even normal American phone numbers, let alone foreign phone number systems. Even if you're only targeting American customers, you'll probably need to support the phone number format for a visiting foreigner as well.
With something as complex as phone numbers I'd just stick to using standard form validation code (like in the HTML standard) to warn users of explicitly invalid input like letters or most special characters, and store phone numbers as arbitrary 20-character text strings for all other purposes. Phone numbers are like time zones: you'd think they're easy to deal with but they're surprisingly finicky to get right.
Yup. My rule for accepting phone numbers is `input.replace(/[^0-9+]/g, "")`. It might strip some expected information in rare cases but it's good enough for me.
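The same rule as a Python sketch, in case it helps to see what it keeps and what it drops (the function name is made up):

```python
import re


def strip_phone(raw: str) -> str:
    """Keep only digits and '+', mirroring the JavaScript one-liner above."""
    return re.sub(r"[^0-9+]", "", raw)


print(strip_phone("(416) 555-1270"))  # 4165551270
print(strip_phone("1-800-FLOWERS"))   # 1800
```

The second call is an example of the "expected information" that gets stripped.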
This works for a lot of other things that people format wildly like Canadian postal codes (which are A1A 1A1 format but many places require presence or absence of a space), credit cards (strip the spaces) and so many other fields.
Why not simply leave it as the user input the value? Validation is one thing, but silently dropping information cannot possibly be helpful for the person that then has to call this number.
I agree it should work for any phone number I've ever encountered, but just why
Will it work for the convention in the UK of writing +44 (0)1234 567 8901 which says to use 01234 dialling code inside the UK or 441234 if calling from another country, and don’t dial 4401234 ever?
A truly global phone number regex is quite literally impossible to make. There are too many combinations and expectations built into these conventions. You listed several here.
The "best" solution is to separate country code into a different field or input. Then have everything other than the country code (generally called a "subscriber number") added to another input.
Then on the backend you would essentially strip out all the non-numeric characters from the subscriber number and combine the country code and stripped "subscriber number" into an E.164 format number and store that in the database.
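A rough sketch of that backend step. The function name is made up, and national trunk prefixes (such as a leading 0) are deliberately not handled here; the 15-digit ceiling is the E.164 maximum:

```python
import re

E164_MAX_DIGITS = 15  # country code plus subscriber number


def to_e164(country_code: str, subscriber_number: str) -> str:
    """Combine a separately entered country code and subscriber number."""
    cc = re.sub(r"[^0-9]", "", country_code)
    digits = re.sub(r"[^0-9]", "", subscriber_number)
    if not cc or not digits:
        raise ValueError("country code and subscriber number are both required")
    if len(cc) + len(digits) > E164_MAX_DIGITS:
        raise ValueError("too many digits for an E.164 number")
    return f"+{cc}{digits}"


print(to_e164("+1", "(416) 555-1270"))  # +14165551270
```

Raising `ValueError` is where the descriptive error message for the remaining cases would hook in.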
(Source) I have spent a decade dealing with phone numbers in databases and web forms. This is the "best" way to handle it, and even it isn't bulletproof, but it works 99.8% of the time. The best way to handle the other 0.2% of cases is to make a descriptive error message that explains to the user how you are expecting them to input their number (ie. No extensions, etc).
This is one of the reasons why I generally don't like `maxlength` on telephone input fields: browsers will truncate pasted-in phone numbers with nary a peep to the user as to why.
Validation with onblur or submission is great, but changing my input makes me angry.
Using letters in a phone number as a mnemonic, or just to flow better, is not a "rare case." You should revise your regex to accept letters, then whether you store them internally as letters or convert to the corresponding numbers is up to you.
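Converting the letters is a small lookup if you go that route. A sketch using the standard telephone keypad layout (the function name is made up):

```python
import re

# Standard keypad: ABC=2, DEF=3, GHI=4, JKL=5, MNO=6, PQRS=7, TUV=8, WXYZ=9
KEYPAD = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "22233344455566677778889999",
)


def vanity_to_digits(raw: str) -> str:
    """Translate letters to keypad digits, then drop the remaining punctuation."""
    translated = raw.upper().translate(KEYPAD)
    return re.sub(r"[^0-9+]", "", translated)


print(vanity_to_digits("1-800-FLOWERS"))  # 18003569377
```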
That makes sense if you are making a contact book or social profile. I was focusing on the case where I want the number as contact information. I want to validate up front that it is a number that I can actually call and don't really care about how the user wants to format it.
I am letting you. It is just that when the number is sent to the backend it is converted into a usable format where I can, for example, generate tel: links and otherwise use the number. I agree that there is some value in being able to echo your number back how you typed it, but I'm not sure it is more valuable than showing you the number how I am actually going to try to call it.
Sorry, sounds like a misunderstanding. I always meant to accept anything, that is why I put a replace in my top comment not saying that I would reject the form.
I thought the same, and was surprised to find the problem here seems to be that there's no part of the HTML spec to set the allowable characters in a text input.
So JavaScript to intercept keypresses or postprocess the string is risky at best and often poorly implemented.
If it was in HTML it could be reliable, and have a unified behaviour when text is pasted in.
For phone numbers there is a "tel" input, so the undesirable behaviour you're experiencing may be in spite of or because of it.
I'd need to double check, but I was under the impression that it affects a validation check, but that it didn't actually prevent the input of these characters.
You can catch the 'invalidity' of the input with the `oninvalid` JS event, then use that to `e.preventDefault()` and show a message as to why it failed.
>So they went to the effort of writing special case code to notice and handle this ... by giving instructions to the person, instead of the computer.
Personally, I'm pretty passionate about NOT changing data that my tools receive. It takes a specific, documented situation for me to say "you give me X, and I make it Y". I am more comfortable with "your address, X, has been standardized to Y. Do you accept?".
Similarly there are many sites that allow you to log in using `your password` or `your password`.swapcase() (for example, Password123 or pASSWORD123). Automatically trying a variant only costs a single bit of entropy and can greatly reduce login issues
This doesn't always work by the way. When you venture outside of ASCII, it's quite often uppercase(lowercase(x)) ≠ uppercase(x) and/or the other way around.
The German letter ß gets uppercased to SS instead of ẞ by most libraries in a neutral/generic culture. ẞ on the other hand gets lowercased to ß.
This happens because there wasn't an official ẞ in German until recently but the uppercasing/lowercasing standard was already written for ß.
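This is easy to check in Python, whose `str.upper()`, `str.lower()` and `str.swapcase()` apply the Unicode case mappings:

```python
lower_eszett = "\u00df"  # ß
upper_eszett = "\u1e9e"  # ẞ

print(lower_eszett.upper())          # SS
print(upper_eszett.lower())          # ß
print(upper_eszett.lower().upper())  # SS, not ẞ
print("maßen".upper() == "massen".upper())  # True: both become MASSEN
print(lower_eszett.swapcase().swapcase())   # ss: swapping twice does not round-trip
```

The last line matters for the caps-lock trick discussed in this thread: for such passwords, swapping case is not its own inverse.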
This doesn't particularly matter since there's a perfectly good fallback for people who do happen to have "exotic" passwords: just type the password in correctly. As long as your mapping is consistent, it's totally fine if the bit sequence you're hashing is linguistically nonsense, because you're never going to display that to the user.
What you don't normally want to do is normalize passwords before hashing and then only store the hash of the normalized string, because that's fragile to changes in your normalization algorithm, e.g. updating your Unicode data tables.
Most competent websites I know accept general UTF8 characters like emoji perfectly fine. There are a lot of crappier websites that don't even have proper unicode support for usernames or profile descriptions out there, though, so your mileage may vary.
As far as I know, there's nothing preventing a password field from containing any valid unicode string. The problem may be IME support or servers stuck in ASCII, but the textbox itself will just work.
Even surprisingly big names are surprisingly bad at this. Don't know recently, but Hotmail/Outlook used to have a rule of only using letters, numbers, and a handful of symbols, also limiting you to at most 16 characters or something. You couldn't even type a space!
They're not necessarily "bad" at it; there's a good chance they just want to make sure that the least competent of their users doesn't make a password that they have trouble with later. They don't care that security-conscious people get frustrated with it.
So I guess that could also be "bad," but not incompetent "bad" or Michael Jackson "bad."
This is much more excusable for email providers to prevent phishing. There are a ton of Unicode code points that are indistinguishable from ASCII letters. There are other security issues that can arise as well. Here is an example from Spotify: https://engineering.atspotify.com/2013/06/18/creative-userna...
I should have specified - this was (is?) for passwords, not usernames. I'm much more sympathetic to limited character sets in usernames, but I don't see much valid reason for doing so with passwords
Honestly, that stuff only proves that big name websites aren't necessarily competent. PayPal used to let you register an account with a password longer than the maximum password length used in the authentication code, for example, essentially allowing you to set a password you could never use with your account again. Being worth billions doesn't mean you've got all the basics down, it just means you've tricked many people into giving you their business.
Even good websites that will accept any valid password string will sometimes cut off the last part of a long password because their hashing algorithm throws that data away. Bcrypt, for example, supports a maximum input length between 50 and 72 bytes, depending on the library you use to hash your passwords. That's bytes, not characters!
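A quick way to see the bytes-versus-characters difference, sketched in Python. The 72-byte figure is the commonly cited bcrypt limit mentioned above; `bytes_kept` is a hypothetical helper that just mimics a hard truncation, and whether a given library truncates silently or raises an error is something to check in its own docs.

```python
BCRYPT_MAX_BYTES = 72  # commonly cited bcrypt input limit; some libraries use less

def bytes_kept(password, limit=BCRYPT_MAX_BYTES):
    """Return the UTF-8 bytes that would survive a hard cut at `limit` bytes."""
    return password.encode("utf-8")[:limit]

ascii_pw = "a" * 40
emoji_pw = "\U0001F511" * 40  # key emoji, 4 bytes each in UTF-8

print(len(ascii_pw), len(ascii_pw.encode("utf-8")))  # 40 40
print(len(emoji_pw), len(emoji_pw.encode("utf-8")))  # 40 160

# Under a 72-byte cut only the first 18 emoji would influence the hash.
print(len(bytes_kept(emoji_pw)) // 4)                # 18
```

One common workaround is to pre-hash the password (e.g. with SHA-256) before handing it to bcrypt, but that has its own pitfalls and is not shown here.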
More primitive systems used to have problems with non-alphanumeric passwords, and once those algorithms have been unleashed upon the unsuspecting public, you need to support them in your login flow for years to come.
For what it’s worth the “big” company I work for stores usernames in MySQL. 15 years ago when the username column was created it was set for ASCII (or whatever legacy charset it was). Changing it to utf8 would be a royal pain in the ass, requiring all kinds of testing and crazy updates across the entire company.
So while we’d love to make it utf8, it is just too much work to justify doing over other things.
I should have noted - i was talking about their restrictions for passwords, not usernames. Since those are hashed before storage, i think there are far fewer excuses for such limitations.
What if the password includes İ? The swapcase would be i. And then again its swapcase would be İ. And swapcase of I is ı. And swapcase of ı is I. Right? Well, it should depend on what language you use. Or should it?
Also I think this was in Github; they ask uppercase and I enter Ğ and Github doesn't recognize it as uppercase letter.
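For what it's worth, this is what Python's locale-independent string methods do with those characters. It is only one implementation; a Turkish-locale-aware library would behave differently, which is rather the point.

```python
dotted_capital_i = "\u0130"  # İ
dotless_small_i = "\u0131"   # ı

# İ does not swap to a single character: it becomes i + COMBINING DOT ABOVE.
swapped = dotted_capital_i.swapcase()
print([hex(ord(c)) for c in swapped])      # ['0x69', '0x307']

# Swapping back gives I + combining dot, which is not the same code point sequence.
print(swapped.swapcase() == dotted_capital_i)  # False

# ı swaps to a plain I, but I swaps to a plain (dotted) i, so that does not round-trip either.
print(dotless_small_i.swapcase())              # I
print(dotless_small_i.swapcase().swapcase())   # i

# Ğ is classified as an uppercase letter by Unicode.
print("\u011e".isupper())                      # True
```

So without locale information swapcase is not an involution here, and a server that hashes `swapcase(input)` as a fallback simply won't match for such passwords.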
For a long time, in some browsers/OSes there was a bug (or perhaps an archaic feature that was accidentally triggered) where the cursor in an input could get stuck and cause all new characters to be inserted to the left; I'm assuming it's related to that.
Notably this was a bug on the input for _setting_ your password, so if you think you've set Password123, you might have actually set 321drowassP, so even after fixing the bug it would still bite many users.
This is the first I've heard of it, and as a Linux user I feel like it's the kind of thing I'd either know about or experienced first-hand. What kind of system would do that? And "for a long time", like, you can't ever login anywhere, it's kind of obvious and breaking functionality badly, how can this exist for more than a single release if at all?
To be clear, it’s intermittent. Perhaps one in 500 times an input is focused, it exhibits this behavior.
I have experienced this so many times over my life with so many different hardware/software configurations, and I have to assume others have as well. It hasn’t happened in years but could explain why the “fix” described in the parent post was implemented.
I recently had to create accounts for work benefits at TWO different sites that had user name complexity requirements, and actually rated the strength of my user name! That's something I had never seen before, and it seems pretty misguided.
The worst of these also had a 20 character password limit (at least it wasn't 8!), along with several of these nonsense requirements that limit repeated characters. I couldn't manage to generate a password they would accept. Eventually I realized that not only did they allow only certain specific special characters, but their password length validation was wrong and would only accept 19 characters because they were testing for <= 20.
This doesn't bother me that much, but what really grinds my gears is how many sites won't let you log in with the correct username and password. I don't care enough about the account to want to set up 2FA, and I'd rather preserve a bit more privacy by not sharing my real phone number or another email address. Some sites seem to insist, and I think it's more about advertising and anti-spam than actual security.
Yahoo seems to be big on this these days. I had an old Yahoo account that I don't use much, but every time I try to log in, they seem to change around exactly what pseudo-2FA they want. Now they won't even let me try to type my password. Good grief, guess I'll just write off that account.
Or the ones that had silently truncated your password (e.g. JetBlue truncating to length 10) in every input and login box so the stored password isn’t what you think it is.
Then one day they stopped truncating … but only in some boxes, not others.
I think there have been so many password breaches that most sites feel that passwords alone are insufficient for security purposes. Most users reuse passwords and if sites don't enforce 2FA they open themselves up to batch account compromises via script kiddies trying all combos found in the username/password dumps found on various haxor forums.
This is especially true for sites that provide email accounts for users on sites like yahoo which are the 2FA for many other sites a user has accounts on. Gaining access to a yahoo user's email account could allow someone to reset all their passwords on any 3rd account they used that email address for when they signed up.
This doesn’t seem particularly alarming. Google's account security is above and beyond the rest of the web right now. I doubt a single attack has been made realistic by this feature.
Due to my work I travel a lot (and by that I mean that I can be in new city every day). I don't want to give google my phone number or my other email, so my account is not tied to anything, the only factor is the password.
Turns out if you try to log in from new device from new location (I guess your account is tied to IP from which it was created), just password alone isn't enough to log in. And there's no alternative way to prove that account actually belongs to me.
I understand that most services try to provide you with good security, but I hate it when everything is overcomplicated and everyone tells me what to do. No, I don't want to give you my phone number. Yes, I know I won't be able to recover my password, I don't need that. Yes, I actually want my password to be this long. No, I don't want you to block login attempts from new locations. Believe it or not, people do travel and want to log in from more than 1 city. It's none of your business if I want to give my password to my friend and let him login, just stop this please. Let me choose whatever password I want without any backup emails, phone numbers, and let everyone who knows the password log in. Is it too much to ask?
Why is it even allowed to let users create accounts without providing a phone number, but then not letting those same users to log in because their accounts are not tied to any phone number? How does it make any sense?
As an ex-Googler in the Information Security Engineering team who has looked at our implementation of password authentication, I confirm this is a feature, not a bug. (Some old mobile devices auto-capitalize the first character typed in a text field.) That said, I can't remember off the top of my head if we just ignored the case altogether or if the logic was more restrictive (e.g. if the first char is uppercase, also allow its lowercase version). Last time I looked at the code was 6 years ago.
Pretty sure this is a detail documented somewhere public-facing
For a very long time, Chase bank public websites only accepted the first 8 characters of your password. Anything else was silently dropped. If you used Chase for any loans, credit cards, or banking, you were forced to change your password around 2016ish, this is when they finally resolved this problem. Why? Mainframes.
Bank of America, internally, required you to have two passwords, a Windows and a UNIX password. The UNIX password was only 8 characters due to, you guessed it, mainframes. I don't know if this was ever resolved.
I was briefly an intern in a company in which the only part of the password that was checked were the first four characters. I don't know whether it was due to using PIN numbers in the past or they just wanted people to feel safe but not call IT constantly about the account not working...
I wonder whether they're still doing that...
It's because early mobile keyboards would default to automatically capitalizing the first character at the start of an input, and apparently applied this behavior to password fields as well. Facebook has also had this password behavior for at least 10 years.
Yeah we did it this way on an app I worked on in the past, try the verbatim input and then a couple of minor variations in casing if it didn't work.
I've also found that for email fields you need to be careful to normalize the input (trim, casing) as safari had a habit of autocorrecting the first character to be a capital
I find apps that don’t trim the whitespace for the email field so annoying in terms of UX. I usually use a Text Replacement shortcut to fill in my emails (e.g. “gml” fills in my GMail address, “cld” my iCloud address etc.) and that always inserts a space after the email and I have to manually fiddle with the cursor to delete it.
>I've also found that for email fields you need to be careful to normalize the input (trim, casing) as safari had a habit of autocorrecting the first character to be a capital
Why is that relevant? The standard technically allows for case sensitivity but nobody does it
It can be problematic when you go to look up the account by email address during login and it isn't found due to inconsistent casing.
It's technically true that the part before the domain can be case sensitive, but as nobody does this the gain in UX from people not having to know the exact casing used during sign-up is worth it to me.
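A minimal sketch of that kind of normalization in Python. `normalize_email` is a made-up helper name, and lowercasing the whole address assumes you are happy to treat the local part as case-insensitive, which is the trade-off described above.

```python
def normalize_email(raw):
    """Trim surrounding whitespace and lowercase the address before lookup."""
    return raw.strip().lower()

# Apply the same function at sign-up and at login so both sides agree.
print(normalize_email("  Alice@Example.com "))  # alice@example.com
```

The important part is applying it in both places; normalizing only at login does nothing for accounts stored with mixed casing.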
Ah, it's a bit ambiguous though: not GP, but I read you as meaning do they store both versions' hash and check against either.
Actually I realise GP is equally ambiguous. But I read that as (and my own assumption would be) frontend retries with the variation, backend verifies against the same only one stored.
If you're actually using a 'strong enough' hash to prevent easy cracking if your hashed password database is leaked then you're doubling the server load which can be quite substantial in some cases.
Because you are changing the database schema to introduce a stupid version field to store "normalized" passwords rather than just doing the check twice on mobile platforms.
Hashing takes a lot of CPU time. And btw you don't even need to change the database schema. You could encode the version in the password field itself. Django does this and it works great
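For anyone wondering what "encode the version in the password field" can look like: Django's stored values have the shape `algorithm$iterations$salt$hash`. Below is a rough Python sketch of that idea using only the standard library. It is not Django's actual code; the `pbkdf2_sha256_norm1` scheme name, the `PREPROCESS` table, and the iteration count are invented for illustration.

```python
import base64
import hashlib
import hmac
import os

# Hypothetical scheme names: the prefix tells the verifier how to preprocess input.
PREPROCESS = {
    "pbkdf2_sha256": lambda pw: pw,                             # hash verbatim
    "pbkdf2_sha256_norm1": lambda pw: pw[:1].lower() + pw[1:],  # lowercase first char
}

def encode_password(password, scheme="pbkdf2_sha256", iterations=100_000, salt=None):
    """Return 'scheme$iterations$salt$hash', storable in a single text column."""
    salt = salt if salt is not None else os.urandom(16)
    data = PREPROCESS[scheme](password).encode("utf-8")
    digest = hashlib.pbkdf2_hmac("sha256", data, salt, iterations)
    return "$".join([
        scheme,
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])

def verify_password(password, encoded):
    """Re-derive the hash using the parameters carried inside the stored value."""
    scheme, iterations, salt_b64, digest_b64 = encoded.split("$")
    data = PREPROCESS[scheme](password).encode("utf-8")
    digest = hashlib.pbkdf2_hmac(
        "sha256", data, base64.b64decode(salt_b64), int(iterations)
    )
    return hmac.compare_digest(digest, base64.b64decode(digest_b64))
```

Old rows keep their old prefix and keep verifying the old way; rows written after a change carry the new prefix. No extra column is needed, only a verifier that dispatches on the prefix.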
The database would contain existing passwords without normalization.
You also have to hold the unnormalized password.
Super silly to do that to save a few processor cycles on login.
It's amazing how much misinformation is in this thread. You should do further reading on password hashing and rethink whether you really have to store two different passwords...
Sounds like the requirement might be for the case insensitivity of the first character to only be for some platforms (eg mobile devices where autocapitalisation might have happened).
In that case this solution would have the disadvantage that it wouldn’t be platform specific.
Of course they hash the password. Of course they don't know the capitalised version of your saved password, but they can know the capitalised version of the password you just entered
But how did they know which punctuation characters to remove from the password?
You may try 2 versions of first letter, but do they go as far as bruteforce removing all the % character combinations from the password, unless they did remove them all?
Assuming the password is sent over the wire (rather than the salt being sent to the client, the client doing the hash, and sending the hash), the password will be stored in memory while the login process runs
In practice, is there really any difference between allowing a client to try 10 passwords before 'lock out' (say, no more attempts for 10 minutes) and allowing only 5?
Certainly it’s not definitive though. This could easily be accomplished by storing multiple hashes, or multiple password checks that alter the user input, but still have Google keeping hashed passwords. Definitive example could be something like them doing a password recovery where they send you a plaintext version of your current password.
This reminds me of an intern project in my friends company where they stored all hashes a few hamming distance from the password, so even with typos you would get logged in.
Iirc it had a cool demo, but was never used in production.
I have received a few notifications of login attempts from Windows 8 phones. I wonder if there are other security allowances for these devices making them the ideal platform for launching attacks.
PS. You can see this for yourself, just leave a negative review for Staples Canada on google and your account will be attacked from somewhere inside Vietnam via windows phone.
Ever call Fidelity phone support and hear "enter your password on the keypad"? That means collapsing ~62 chars into 10 char options, a massive space reduction.
Then there's the fact that many banking sites (BofA, IIRC) only used the first 8 char of your password anyway.
Yikes, I didn't know that. Seems like I need to make my fidelity password 6 times longer.
Does this also mean they probably store passwords in clear text? Because there's no way to normalize the numeric passwords back to letters and symbols.
It doesn't mean that they store in cleartext, but they may as well.
They can generate the phone password on the client side and send both passwords to be salted, hashed, and stored separately.
That much seems OK.
But the salted+hashed phone password is incredibly weak. It can be brute forced readily unless it is very long.
From the brute forced phone password, the regular password can be brute forced as well, since the digits of the phone password tremendously constrain the characters of the regular password.
It's very much like the Hollywood hacking where the hackers progressively lock digits of your password and eventually discover the whole thing.
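To put rough numbers on how much the keypad version leaks, here is a Python sketch. The mapping is the standard phone keypad for letters; sending every other symbol to `*` and counting 32 possible symbols per `*` are my assumptions, not a description of any particular bank's system.

```python
import string

KEYPAD_GROUPS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
                 "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit
                   for digit, letters in KEYPAD_GROUPS.items()
                   for ch in letters}

def to_keypad(password):
    """Collapse a password to keypad keys (letters case-insensitively)."""
    out = []
    for ch in password:
        if ch in string.digits:
            out.append(ch)
        elif ch.lower() in LETTER_TO_DIGIT:
            out.append(LETTER_TO_DIGIT[ch.lower()])
        else:
            out.append("*")  # assumption: all other symbols share one key
    return "".join(out)

def candidates_for(keypad_code):
    """Count the original passwords consistent with a known keypad code."""
    total = 1
    for key in keypad_code:
        if key == "*":
            total *= len(string.punctuation)  # 32 ASCII symbols, assumed
        else:
            # upper- and lowercase letters on the key, plus the digit itself
            total *= 2 * len(KEYPAD_GROUPS.get(key, "")) + 1
    return total

code = to_keypad("Password1")
print(code)                  # 727796731
print(candidates_for(code))  # 20253807
print(62 ** 9)               # size of the full 9-char alphanumeric space
```

Once the 9-key code is known, the regular password is one of about 20 million candidates instead of 62^9, which is the "locking digits one by one" effect in numbers.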
This has been like this for at least a couple years now. Struck me as bit odd in the beginning but it doesn't really improve the chances of brute forcing too much (which is hard on Google login anyways). And potentially saves so much time and server resources
It got me thinking - imagine wanting to let users log in with a single character typo in their password, could you do this without storing hashes of all edit distance 1 passwords?
(I just woke up and my brain is not yet fully functional, so what follows is probably totally stupid)
Notation: "||" means string concatenation.
Let password P = P1 || P2, where len(P1) + len(P2) = len(P) and |len(P1) - len(P2)| <= 1.
Let H1 = hash(P1), H2 = hash(P2), where hash() is a cryptographic hash function that produces at least as many bits as the longest allowed password and satisfies whatever slowness and memory use requirements that you have for a password hash.
To store a password, store P ⊕ H1 and P ⊕ H2.
To check a password candidate C received at login, let C = C1 || C2 using the same splitting rule as used for P above, and compute C ⊕ hash(C1) and C ⊕ hash(C2).
A login is successful if either P ⊕ H1 or P ⊕ H2 is within edit distance 1 of either C ⊕ hash(C1) or C ⊕ hash(C2).
(I've omitted salt from the above for simplicity. Replace the hash with a salted hash if you want salt).
You could brute force all possible changes but that could take quite a while for longer passwords, you probably don't want to do that in production on every login.
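For a sense of scale, here is a Python sketch of the brute-force option: enumerate everything within one insertion, deletion, or substitution of what the user typed. The 94-character printable ASCII alphabet is an assumption; a site that allows Unicode would have a far larger set.

```python
import string

# Assumed alphabet: 94 printable ASCII characters (no space).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def edit_distance_one(typed, alphabet=ALPHABET):
    """Return every string exactly one insert, delete, or substitute away."""
    results = set()
    for i in range(len(typed)):
        results.add(typed[:i] + typed[i + 1:])           # deletion
        for ch in alphabet:
            results.add(typed[:i] + ch + typed[i + 1:])  # substitution
    for i in range(len(typed) + 1):
        for ch in alphabet:
            results.add(typed[:i] + ch + typed[i:])      # insertion
    results.discard(typed)  # substitutions also reproduce the input itself
    return results

variants = edit_distance_one("correcthorse")
print(len(variants))  # on the order of a couple of thousand for 12 characters
```

Each of those variants would need its own run of the slow password hash, so a hash tuned to take a noticeable fraction of a second turns one failed login into minutes of CPU time.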
doesn't that mean they are storing plain text/reversible encrypted passwords?
I've had Gmail and Facebook accounts since way before mobile was invented. If they've added that feature for mobile, imho it means the password was stored in plain text or with 2-way encryption.
No, you can just change the password you receive and test several versions against the hash. So you stored your password with a lowercase first character in 2011 and now you enter it with a capitalized first character. They can just hash it like you sent it to them, but on top of that they can also lowercase the first character and then hash it, and that will then match the hash from the password you used when signing up.
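A minimal Python sketch of that flow, assuming a salted PBKDF2 hash. The function names, iteration count, and the exact set of variants are illustrative and not anyone's real implementation. Note that only the hash of the original password is ever stored.

```python
import hashlib
import hmac

def hash_password(password, salt, iterations=100_000):
    """Salted slow hash of the password (PBKDF2-HMAC-SHA256 as a stand-in)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def candidate_variants(typed):
    """The typed input plus a few 'undo the keyboard mishap' transformations."""
    variants = [
        typed,                          # exactly as typed
        typed.swapcase(),               # caps lock was on
        typed[:1].lower() + typed[1:],  # mobile keyboard capitalized first char
    ]
    return list(dict.fromkeys(variants))  # drop duplicates, keep order

def check_login(typed, salt, stored_hash):
    """Accept if any variant of the input hashes to the single stored hash."""
    return any(
        hmac.compare_digest(hash_password(v, salt), stored_hash)
        for v in candidate_variants(typed)
    )

salt = b"0123456789abcdef"  # fixed only for the demo; use a random salt per user
stored = hash_password("password123", salt)
print(check_login("PASSWORD123", salt, stored))  # True  (caps lock)
print(check_login("Password123", salt, stored))  # True  (auto-capitalized)
print(check_login("passWord123", salt, stored))  # False (a genuinely different password)
```

The cost is up to three hash computations on a failed login instead of one, and nothing about the stored data has to change.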
This is because blizzard uses a cryptographically secure, non-disclosing, challenge and response protocol called SRP6 to authenticate users, rather than a password hash database. The password is not stored server side, but the client is able to prove it knew the original password based on its relation to a private/public key pair generated as part of the authentication scheme.
Facebook actually accepts three forms of your password:
* Your original password.
* Your original password with the first letter capitalized. This is only for mobile devices, which sometimes capitalize the first character of a word.
* Your original password with the case reversed, for those with a caps lock key on.
[0]: https://www.zdnet.com/article/facebook-passwords-are-not-cas...