
"Why?"

Why not?

"If the alternative is garbled text, is that what you choose?"

No. The alternative is not garbled text, and that's not what I choose.

The alternative for me is a subset of ASCII. I choose which characters I will accept and delete the rest.

For example, something like

   tr -cd '[\12\40-\176]'

This has worked for me for several decades. Nor am I the only one who uses this approach. I once saw an HN commenter say their favourite regex was

   tr -cd '[ -~]'
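
To illustrate what such a filter does, here is a minimal sketch. The function name strip_non_ascii and the sample input are made up for the example, and forcing the C locale is an assumption so that tr works on single bytes. The square brackets are left out because POSIX tr treats them as ordinary characters in a set; they are harmless in the commands above since both already fall inside the kept range.

```shell
# Hypothetical helper: keep newline (octal 12) and printable ASCII
# (octal 40 through 176, i.e. space through tilde); delete every other byte.
strip_non_ascii() {
    # -c complements the set, -d deletes, so everything NOT listed is removed.
    # LC_ALL=C is an assumption here, to make tr operate byte by byte.
    LC_ALL=C tr -cd '\12\40-\176'
}

# Sample input: "café", a space, a UTF-8 em dash, a space, "ok".
# The multibyte characters are written as octal escapes.
printf 'caf\303\251 \342\200\224 ok\n' | strip_non_ascii
# prints "caf  ok" (the accented letter and the dash are gone, both spaces remain)
```

Note that tabs and carriage returns are also deleted by this set, since only the newline is kept from the control characters.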


That’s fair. Is that what links without Unicode support does? (Ignore all byte sequences it doesn’t recognize.) Also, I’d still love to know why you prefer stripping out non-ASCII characters — does this sentence become more readable to you with the em dash omitted?


It just simplifies things for me. If I can read text without Unicode, then I don't need it. It's one less variable I need to worry about. Maybe another way to look at it is cost-benefit analysis. I just don't get much benefit from Unicode in the console (I'm usually just reading text), whereas it causes problems from time to time.

I can see a dash in 7-bit ASCII. I am not going to lose the meaning of a sentence by forgoing a few Unicode characters.



