"punycode" does seem to break a few things in userland, but obviously should exi...

buckminster · on Nov 16, 2020

Why worry about the dictionaries? No legitimate user will accidentally enter a Greek Alpha rather than an A, for example. If a malicious user does this it won't resolve anyway because the domain is banned by the rules and can't be registered.
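For anyone unsure what the Greek Alpha case looks like in practice, here is a rough sketch. The `scripts_in` helper is made up for illustration, and guessing a script from the first word of a character's Unicode name is only a heuristic, not a real script lookup.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Rough script guess: first word of each letter's Unicode name (heuristic only)."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in label if ch.isalpha()}

latin = "APPLE"
spoof = "\u0391PPLE"  # first letter is GREEK CAPITAL LETTER ALPHA, not Latin A

print(latin == spoof)     # False: the strings differ even if they may render alike
print(scripts_in(latin))  # {'LATIN'}
print(scripts_in(spoof))  # contains both 'GREEK' and 'LATIN'
```

A label that mixes scripts like this is the sort of thing a registry rule would typically reject, though the exact rules vary by registry.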

ricardo81 · on Nov 16, 2020

Yeah, perhaps less important for LinkedIn but more pertinent for other parties like domain registrars and TLS cert issuers. Apparently the 'rules', à la the registry's DNS, are an evolving thing.

https://www.theregister.com/2020/03/04/homograph_attacks_sti...

Ignoring the specific characters without a local frame of reference would require at least a DNS lookup.

tinus_hn · on Nov 17, 2020

This kind of trick is used in mail sent to users with legitimate-looking links, which can be used for phishing.

BlueTemplar · on Nov 16, 2020

Am I misunderstanding or are you looking for Unicode normalization?

http://site.icu-project.org/design/normalization/custom
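As a small illustration of what standard normalization does and does not cover, here is a sketch using Python's `unicodedata` module (not ICU, which the link above describes). NFKC folds compatibility variants such as fullwidth letters, but it leaves cross-script lookalikes alone.

```python
import unicodedata

fullwidth_a = "\uFF21"  # FULLWIDTH LATIN CAPITAL LETTER A
greek_alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

# Compatibility variants collapse to the plain ASCII letter...
print(unicodedata.normalize("NFKC", fullwidth_a) == "A")  # True
# ...but a lookalike from another script stays what it is.
print(unicodedata.normalize("NFKC", greek_alpha) == "A")  # False
```

So normalization alone would not catch the Greek Alpha case; that needs some kind of per-script or per-registry character check.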

ricardo81 · on Nov 16, 2020

I maintained a domain database and normalised to ASCII with libidn. Sometimes the input data was not from zone files, and I would have preferred to be able to double-check the characters used in a potential domain to ensure it's something that's registrable, without any network access required. That was my motivation for looking into the topic originally.
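A minimal sketch of that kind of offline check, using Python's built-in `idna` codec (IDNA 2003) as a stand-in for libidn. The `ALLOWED` set and both helper names are hypothetical; real per-TLD character tables come from the registries and change over time.

```python
import string

# Hypothetical allow-list for one registry: letters, digits, hyphen, plus a few extras.
LDH = set(string.ascii_lowercase + string.digits + "-")
ALLOWED = LDH | set("\u00e4\u00f6\u00fc")  # ä ö ü (assumed, for illustration)

def to_ascii(domain: str) -> str:
    """ASCII (Punycode) form via Python's built-in IDNA 2003 codec."""
    return domain.encode("idna").decode("ascii")

def uses_allowed_chars(domain: str, allowed=ALLOWED) -> bool:
    """True if every character of every label is in the allow-list (no network needed)."""
    return all(ch in allowed for label in domain.lower().split(".") for ch in label)

print(to_ascii("b\u00fccher.example"))            # xn--bcher-kva.example
print(uses_allowed_chars("b\u00fccher.example"))  # True
print(uses_allowed_chars("\u03b1pple.example"))   # False: Greek alpha is not in the list
```

This only checks characters against a local table; it says nothing about whether a name is actually registered or resolves.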