My favorite Polish language-related computer problems:
1. Windows installing both the programmers' and typists' Polish keyboard layouts by default, plus a shortcut to switch between them (I think it was alt+shift+space or even alt+space?). Most people had no idea there even are 2 layouts, and if they randomly pressed this shortcut and logged out, the next time they tried to log in Z and Y were switched, and if they had Z or Y in their password they couldn't log in. This one caused millions of support calls.
2. Corporate computers with both English and Polish layouts installed - if you happen to have the layout switched to English and try to write an email in Outlook, you will likely start with "cześć" ("hi"), and when you press alt+s for "ś" it will send "cze" as the full email. "Cze" is a very casual "hi", like opening a business email with "yo".
3. Same as above but in Eclipse: with some version control plugins alt+s committed the code. Especially frustrating in CVS/SVN.
4. Python 2 code often breaks if you have Polish letters embedded in code as string literals. It depends on the default system code page vs. the default Python encoding and other stuff; for some reason it's not UTF-8 by default in Python 2. The solution is to use Python 3 or mess with the default settings.
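To make point 4 a bit more concrete, here is a rough Python 3 sketch of the mismatch. Python 2 assumed ASCII for source files and for implicit bytes/text conversions unless told otherwise, so UTF-8 bytes for Polish letters failed to decode. The sample word, the helper name `decode_or_error`, and the choice of cp1250 as the "system code page" are assumptions for illustration only.

```python
# Python 3 sketch: reproduce explicitly what Python 2 tended to do implicitly.
text = "cześć"
raw = text.encode("utf-8")  # the bytes a UTF-8 editor would write to disk


def decode_or_error(data: bytes, encoding: str) -> str:
    """Decode bytes; return the exception class name instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


# ASCII cannot represent the bytes for "ś" and "ć" at all.
ascii_result = decode_or_error(raw, "ascii")
# A legacy Windows code page decodes them, but into the wrong characters.
cp1250_result = decode_or_error(raw, "cp1250")
# Only the matching encoding round-trips.
utf8_result = decode_or_error(raw, "utf-8")

print(len(text), len(raw))    # 5 characters, 7 bytes
print(ascii_result)           # UnicodeDecodeError
print(cp1250_result == text)  # False (mojibake)
print(utf8_result == text)    # True
```

The point is only that the same bytes mean different things depending on which encoding the reader assumes, which is why being explicit (or using Python 3) avoids the problem.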
Surely this is down to one key error (pun intended) - Medium deciding to override a standard browser shortcut 'for the good of their users'. If they needed a manual save function then I might understand it, but they tried to be clever and made things worse in a subtle way.
That's a really well-written blogpost! I was expecting it to be much more surface-level (something to do with character codes - still possibly interesting!) but it had new info for me on several levels (the background of the bug, and the personal history stuff didn't feel too fillery to me).
An interesting fact for non-Polish-speaking readers: in informal writing we often don't use diacritic characters at all. It makes writing faster. With the rise of spell checkers this is fading out, but still, if you write without diacritics you will usually be well understood.
Second interesting fact: it is very common for software and online apps, especially ones not developed in Europe, to ignore diacritics. Not only Polish ones, but also French, German etc. You get weird characters instead or cannot write properly at all. I hope the article will shine a light on the issue.
Sure, but sometimes leaving out Polish diacritics makes the whole sentence ambiguous, or at the very least harder to read. I personally despise people doing that.
More to the article's point: there were countless times when I accidentally sent an unfinished email by trying to type "ś".
It's surprising how much software (mostly on Windows) doesn't properly handle Unicode in 2021. With something working in Unicode, it's not that big of a deal to both handle letters like Ł or ż and also to run normalization on text strings so that you can (if desired) treat Łódź and Lodz as identical (e.g., for text searching).
Just a note that normalization is not the same as diacritic insensitivity. Normalization is the process by which strings that are semantically equivalent (by some standard), yet have different underlying byte sequences, are transformed to have the same underlying byte sequence. For instance, replacing “e, combining acute accent” with “e with acute accent”.
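A small Python sketch of that distinction, using the standard `unicodedata` module. The helper names and the tiny `STROKE_FOLD` table are made up for illustration; a real search index would need a much fuller mapping or a proper collation library.

```python
import unicodedata

# Letters such as Ł have no canonical decomposition, so stripping combining
# marks does not touch them. This table is illustrative and far from complete.
STROKE_FOLD = str.maketrans({"Ł": "L", "ł": "l"})


def normalize(s: str) -> str:
    """Canonical composition: same text, one agreed code point sequence."""
    return unicodedata.normalize("NFC", s)


def fold_diacritics(s: str) -> str:
    """Lossy search key: drop combining marks, then map stroked letters."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.translate(STROKE_FOLD)


composed = "\u00e9"    # "é" as a single code point
combining = "e\u0301"  # "e" followed by a combining acute accent

print(composed == combining)                        # False: different code points
print(normalize(composed) == normalize(combining))  # True: normalization unifies them
print(normalize("Łódź") == normalize("Lodz"))       # False: normalization keeps diacritics
print(fold_diacritics("Łódź") == "Lodz")            # True: folding is a separate, lossy step
```

So normalization alone never makes Łódź and Lodz equal; that takes an explicit folding step layered on top.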
I wonder how much computers and spell-checkers are slowing down the evolution of writing systems?
In the past people could naturally stop using diacritics or make letter substitutions. Over time the language might eliminate their use. That seems less likely now.
Similarly introducing a new letter seems rather difficult in the computer age.
It's popular for some reason even in countries where it doesn't save keystrokes. For example, in Croatia we have 5 characters with diacritics (š, đ, č, ć, ž), all of which have dedicated keys on the keyboard, and yet many people have a habit of simply not using them.
In German, ä would never be replaced by a (if all you have is ASCII the proper replacement is ae), except by foreigners who assume that diacritics don't matter.
I've noticed that I automatically lose some respect towards a person if I find that they don't use diacritics in their writing. Especially in a professional setting.
seems to me like an exceptionally strange choice. Why not an exclusive-or? The thing they want to avoid is a false-positive on both being pressed, so test for that directly.
> Also, JavaScript doesn't have a logical xor operator, so trying to do that would potentially reduce readability.
I also didn't know about any operator to logically xor two boolean variables (thought about (ab)using JavaScript's implicit type conversion mechanisms: `x ^ y`), and then I learnt that `!=` works fine as a logical xor for booleans. Tada!
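The same trick carries over to other languages. A quick sketch in Python (not JavaScript), checking that `!=` on two genuine booleans matches the textbook definition of xor for all four input combinations; the function name `xor` is just for illustration. It only holds when both operands really are booleans, not arbitrary truthy values.

```python
from itertools import product


def xor(a: bool, b: bool) -> bool:
    """Logical xor for real booleans, written as inequality."""
    return a != b


# Build the full truth table and compare against the textbook definition.
xor_table = {(a, b): xor(a, b) for a, b in product([False, True], repeat=2)}

for (a, b), result in xor_table.items():
    assert result == ((a or b) and not (a and b))

print(xor_table)
```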
I don't know how much readability is reduced by this.
The only place XOR could be used is to replace the OR. As you say, replacing the AND NOT with an XOR would suppress Alt + s.
Even replacing the OR with XOR is not obviously an improvement as it is not clear that we know what the correct behavior would be in the edge case where a keyboard event is emitted with both the control and command flags set.
This is a really common failure mode - people forget to explicitly assert that the other modifiers are off when checking for a modifier being on. I had to go through and fix all the ones in matrix-react-sdk (element web) a few years ago: https://github.com/matrix-org/matrix-react-sdk/pull/825/file...
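A rough sketch of that failure mode, modelled in Python rather than real browser code. `KeyEvent` and both `is_save_*` functions are hypothetical stand-ins, and the event for "ś" assumes the Windows behaviour where AltGr is reported as Ctrl+Alt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """Hypothetical stand-in for a browser keyboard event."""
    key: str
    ctrl: bool = False
    alt: bool = False
    meta: bool = False
    shift: bool = False


def is_save_loose(e: KeyEvent) -> bool:
    # Only checks that Ctrl or Cmd is on; says nothing about the others.
    return e.key == "s" and (e.ctrl or e.meta)


def is_save_strict(e: KeyEvent) -> bool:
    # Exactly one of Ctrl/Cmd, with Alt and Shift explicitly required off.
    return e.key == "s" and (e.ctrl != e.meta) and not e.alt and not e.shift


ctrl_s = KeyEvent("s", ctrl=True)
altgr_s = KeyEvent("s", ctrl=True, alt=True)  # assumed: AltGr+S, i.e. typing "ś"

print(is_save_loose(altgr_s))   # True: "ś" is mistaken for the save shortcut
print(is_save_strict(altgr_s))  # False: Alt is on, so this is not a save
print(is_save_strict(ctrl_s))   # True: plain Ctrl+S still works
```

Using `!=` between Ctrl and Cmd here is a judgment call; whether an event with both set should count as a save is exactly the open edge case discussed above.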