Character encoding is in a special class of problems. Like time handling.
If you pick up a halfway modern framework in a reasonably common language with a reasonably non-terrible persistence layer like Postgres, you just don't have problems. You don't have to care, and it just works.
But it's super easy to derail that fragile correctness with something like MySQL's utf8-ish handling (where "utf8" historically meant a 3-byte subset that can't store all of Unicode), or some OS's path handling, or 'efficiency' tricks, or a user or frontend dev submitting data in the wrong encoding. And then the data gets mangled. And then the user is unhappy.
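A minimal sketch of that kind of mangling, assuming one layer writes UTF-8 and another reads it back as Latin-1 (the name "Müller" is just an illustrative value, not from any real system):

```python
original = "Müller"                    # what the user typed

# Layer A correctly writes UTF-8 bytes to storage...
raw = original.encode("utf-8")         # b'M\xc3\xbcller'

# ...but Layer B reads them back assuming Latin-1.
mangled = raw.decode("latin-1")
print(mangled)                         # MÃ¼ller

# Both views are "valid": the bytes decode cleanly either way,
# which is exactly why each side can claim to be correct.
print(mangled.encode("latin-1").decode("utf-8"))  # Müller
```

This is also why the argument below is so hard to settle: neither side produces a decoding error, so both outputs look internally consistent.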
At that point it becomes very hard to argue why one of the two sides is wrong and the other is not, while the user argues the other way around, because both look correct if you look from the right angle. And the only reason I am right is some standard, while the customer is right because of money.
And yes, it is always very 'surprising' that our software now functions correctly for Russian or Greek customers.
That it's a special class of problems doesn't mean it shouldn't have been solved by now. Time handling should be solved too; it's amazing that an iOS app can't reliably get the current correct GMT time.
It's not bizarre at all. Character encodings are a sort of language in themselves, and end up with all the problems that regular old languages have – there's a lot of variety, people can't agree on one particular solution, and there's not a lot of money in taking care of the edge cases.
It would be bizarre if we were at the point where we had perfect translations for everything, but still struggled with character encodings specifically.
For self-driving cars, the ISS, and digital cameras, everything you do is blurry in a sense: a "good enough" approximation is actually good enough. Character encoding and transformation, on the other hand, have to be done perfectly and precisely, and they have a surprisingly large number of edge cases.
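One of those edge cases, as a small sketch: two Unicode strings can render identically yet compare unequal, because the same accented letter can be stored precomposed or as a base letter plus a combining mark. Normalization (here NFC, via Python's standard `unicodedata` module) is needed before comparing:

```python
import unicodedata

# Visually identical strings: precomposed U+00E9 vs. "e" + combining
# acute accent U+0301.
a = "caf\u00e9"
b = "cafe\u0301"

print(a == b)        # False: different code point sequences
print(len(a), len(b))  # 4 5

# Normalizing both to NFC makes them compare equal.
na = unicodedata.normalize("NFC", a)
nb = unicodedata.normalize("NFC", b)
print(na == nb)      # True
```

There is no "good enough" here: a lookup or dedup step that skips normalization will silently treat identical-looking user input as different records.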