Hacker News

Ah, the famous Hangul filler. That's actually a Unicode bug they have refused to fix for some years now. It's still listed as an identifier character. I fixed that in my interpreter, cperl.
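To illustrate the filler problem, here is a minimal Python sketch. It is not cperl's actual code, and the names `HANGUL_FILLERS` and `has_invisible_filler` are hypothetical. The standard `unicodedata` module reports U+3164 as an ordinary letter (general category `Lo`), which is why letter-based identifier rules let it through unless it is blocked explicitly.

```python
import unicodedata

# The four Hangul filler code points. They render as blank space
# but are classified as letters (general category "Lo").
HANGUL_FILLERS = {"\u115F", "\u1160", "\u3164", "\uFFA0"}

def has_invisible_filler(identifier: str) -> bool:
    """Return True if the identifier contains a Hangul filler (hypothetical helper)."""
    return any(ch in HANGUL_FILLERS for ch in identifier)

# Show how the Unicode database classifies the filler.
print(unicodedata.name("\u3164"), unicodedata.category("\u3164"))
print(has_invisible_filler("user\u3164"))
```

A parser could run a check like this before accepting an identifier and reject the token outright instead of silently binding an invisible name.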

The next bugs are actually all JavaScript bugs: it accepts Unicode identifiers but doesn't check them against the Unicode security guidelines and ignores every profile. So it accepts bidi controls, mixed scripts, and unnormalized identifiers. This is very common: 99% of all interpreters and compilers don't care about Unicode security at all. They are rather proud of accepting everything, and point fingers at colleagues who only accept ASCII English.
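The three checks named above (bidi controls, mixed scripts, unnormalized identifiers) could be sketched roughly like this in Python. This is an approximation under stated assumptions, not an implementation of any official profile. `identifier_problems` and `rough_script` are hypothetical names, and the script guess uses the first word of the character name because the standard library does not expose the real Script property. It is also stricter than the official rules, which permit some script combinations.

```python
import unicodedata

# Explicit bidirectional control characters (marks, embeddings, overrides, isolates).
BIDI_CONTROLS = set(
    "\u061C\u200E\u200F\u202A\u202B\u202C\u202D\u202E\u2066\u2067\u2068\u2069"
)

def rough_script(ch: str) -> str:
    """Crude script guess: first word of the Unicode character name.
    Assumption: a stand-in for the real Script property."""
    if not ch.isalpha():
        return "COMMON"  # digits, underscores, combining marks, controls
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]

def identifier_problems(ident: str) -> list:
    """Return a list of human-readable problems found in the identifier."""
    problems = []
    if any(ch in BIDI_CONTROLS for ch in ident):
        problems.append("bidi control")
    if unicodedata.normalize("NFC", ident) != ident:
        problems.append("not NFC-normalized")
    scripts = {rough_script(ch) for ch in ident} - {"COMMON"}
    if len(scripts) > 1:
        problems.append("mixed scripts: " + ", ".join(sorted(scripts)))
    return problems

print(identifier_problems("p\u0430ypal"))  # Latin letters with a Cyrillic "a"
```

A compiler front end would reject any identifier for which the returned list is non-empty. A production check would use the real script and identifier-status data instead of the name heuristic.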

Identifiers need to be identifiable by a human. That's the whole point. And the system needs to block illegal identifiers.

It's similar with filesystem drivers. Path names are identifiers too, but the driver writers think they are above such human issues. For them there is only garbage in, garbage out. Their path names are certainly not identifiable. A directory can contain bidi names, or mixed-script names using Russian and Greek letters that all look the same, or names that are simply not normalized. There can be multiple visually identical names, and you never know which is which. At least for domain names they came up with the Punycode solution, but this was only the tip of the iceberg, and it was a rather awkward workaround.
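The "visually duplicate names" case is easy to reproduce for the normalization part. A minimal Python sketch, with `normalization_collisions` as a hypothetical helper name: it groups directory entries that become identical after NFC normalization. It does not catch cross-script lookalikes, which would need the Unicode confusables data that the standard library does not ship.

```python
import unicodedata
from collections import defaultdict

def normalization_collisions(names):
    """Group names that become identical after NFC normalization."""
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize("NFC", name)].append(name)
    # Only groups with more than one member are visually duplicate names.
    return [group for group in groups.values() if len(group) > 1]

# Two spellings of the same visible name:
# precomposed e-acute versus "e" followed by a combining acute accent.
listing = ["caf\u00e9.txt", "cafe\u0301.txt", "notes.txt"]
print(normalization_collisions(listing))
```

For a real directory the input could come from `os.listdir(path)`. Whether both spellings can coexist at all depends on the filesystem, which is exactly the inconsistency described above.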



