That’s fair. Is that what links without Unicode support does? (Ignore all byte sequences it doesn’t recognize.) Also, I’d still love to know why you prefer stripping out non-ASCII characters — does this sentence become more readable to you with the em dash omitted?
It just simplifies things for me. If I can read text without Unicode, then I don't need it. It's one less variable I need to worry about. Maybe another way to look at it is cost-benefit analysis. I just don't get much benefit from Unicode in the console (I'm usually just reading text), whereas it does cause problems from time to time.
I can see a dash in 7-bit ASCII. I am not going to lose the meaning of a sentence by forgoing a few Unicode characters.
Why not?
"If the alternative is garbled text, is that what you choose?"
No. The alternative is not garbled text and that's not what I choose.
The alternative for me is a subset of ASCII. I choose which characters I will accept and delete the rest.
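To make the allowlist idea concrete, here is a minimal Python sketch. It is only an illustration of "accept a chosen subset, delete everything else", not the actual command used by anyone in this thread. The name `keep_allowed` and the particular allowlist (ASCII letters, digits, punctuation, space, tab, newline) are assumptions picked for the example.

```python
import string

# Hypothetical allowlist: ASCII letters, digits, punctuation, plus space,
# newline and tab. Adjust to whatever subset you are willing to accept.
ALLOWED = frozenset(
    string.ascii_letters + string.digits + string.punctuation + " \n\t"
)


def keep_allowed(text: str, allowed: frozenset = ALLOWED) -> str:
    """Return text with every character outside the allowlist deleted."""
    return "".join(ch for ch in text if ch in allowed)


if __name__ == "__main__":
    import sys

    # Filter stdin to stdout; undecodable bytes are dropped on read,
    # so the output is never garbled, only shorter.
    data = sys.stdin.buffer.read().decode("utf-8", errors="ignore")
    sys.stdout.write(keep_allowed(data))
```

Note the trade-off in this sketch: characters are deleted, not transliterated, so an em dash between two spaces leaves a double space and an accented letter simply disappears.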
For example, something like
This has worked for me for several decades. Nor am I the only one who uses this approach. I once saw an HN commenter say their favourite regex was