
josteink · on Nov 10, 2021

> I think the recommendation to disallow any non-ASCII character is throwing out the baby with the bathwater.

Not throwing out all non-ASCII characters from code files. Just rejecting them as invalid in identifiers (think variables, function names, etc.).

> How about code that wants to display some emojis?

Fine. You quote that emoji in a string, and it's golden.

If you try to make a variable with an emoji as its name, however, your code crashes.

That sounds fine to me.
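
A rough sketch of what such a rule could look like as a lint check, using Python's standard `tokenize` module. The function name `non_ascii_identifiers` and the sample source are made up for illustration; this is not how any particular compiler or linter does it.

```python
import io
import tokenize


def non_ascii_identifiers(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for every identifier with a non-ASCII character."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens are identifiers and keywords. String literals arrive
        # as STRING tokens, so their contents are never inspected here.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.start[0], tok.string))
    return found


# Hypothetical input: a Greek-letter variable plus non-ASCII text in a string.
sample = 'π = 3.14\nlabel = "⌘ + π"\n'
print(non_ascii_identifiers(sample))
```

Only the `π` on the first line is reported; the quoted `⌘` and `π` on the second line pass untouched.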

speleding · on Nov 10, 2021

That would close this particular attack (but not the BIDI one the article mentions). But there is probably already too much code out there with π=3.14 in it to be feasible to do this.
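
The BIDI case needs a different check, because the control characters sit inside strings and comments rather than in identifiers. A minimal sketch, assuming the set of embedding, override and isolate characters below is the one you care about (`find_bidi_controls` is a made-up name):

```python
# Bidirectional embeddings, overrides, isolates and their terminators.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE RLE PDF LRO RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI RLI FSI PDI
}


def find_bidi_controls(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for each bidi control character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


# Hypothetical input: an override character hidden inside a string literal.
sample = 'access = "user\u202e"  # check\n'
print(find_bidi_controls(sample))
```

Unlike an identifier rule, this scans the whole file, string contents included.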

smcl · on Nov 10, 2021

I really thought that using the Greek letter for pi (or theta, etc.) was something you do to show that your programming language supports Unicode identifiers, but that nobody actually does in real life. I wonder how people input this: do they know the Alt+xyz combo, do they select-copy-paste, or is there another way to write these characters that I'm not aware of?

Just to be clear, I don't mean people who are actually using Greek language for input - it's pretty obvious how they would type that character :)

speleding · on Nov 11, 2021

Pi is simply Alt+P on the Mac, pretty easy to remember.

josteink · on Nov 10, 2021

> But there is probably already too much code out there with π=3.14 in it to be feasible to do this.

So for JS, let it break in new, module-based strict-mode code.

That’s going to be processed by tooling prior to shipping anyway, so that’ll get caught.

For other platforms, do the same in some forward-looking revision of the language/compiler.

People have to fix obsolete/deprecated stuff in newer compilers/class libraries all the time. This is no different.

YetAnotherNick · on Nov 10, 2021

Do you really have to write emoji in the code string? The same goes for international language characters. The sane thing is to use either JSON config files or i18n libraries.
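
Roughly what the JSON route could look like, as a sketch only. The keys `shortcut_icon` and `greeting` are invented for the example, and the config is inlined as a string here where a real project would read it from a file.

```python
import json

# Stand-in for the contents of a config file. JSON's \uXXXX escapes keep the
# text pure ASCII; json.dumps emits them by default (ensure_ascii=True).
config_text = '{"shortcut_icon": "\\u2318", "greeting": "caf\\u00e9"}'

labels = json.loads(config_text)

print(config_text.isascii())
print(labels["shortcut_icon"], labels["greeting"])
```

The source file and the config both stay ASCII-only, and the non-ASCII characters only exist once the config has been parsed.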

speleding · on Nov 10, 2021

If you are writing something intended for a single audience, using i18n libraries can be unnecessary overhead. And emoji can also be icons like ⌘ that are useful to display in the UI.