
The "tokens" you're thinking of are "grapheme clusters" in Unicode.

Unfortunately, just reversing by grapheme cluster doesn't solve the problem, because of directional formatting codes: if you have, e.g., a right-to-left embedding (U+202B) followed by a pop directional formatting (U+202C), you can't naively reverse them, since the pop would end up before the embedding it is supposed to close.
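To make that concrete, here is a rough sketch in Python. The standard library has no real grapheme cluster segmentation, so `approx_clusters` is a hypothetical helper that only glues combining marks onto the preceding base character; it is a crude stand-in for the actual UAX #29 rules, not an implementation of them. The sample strings are made up for illustration.

```python
import unicodedata

RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def approx_clusters(text):
    """Group each base character with the combining marks that follow it.

    Assumption: this is only an approximation of extended grapheme
    clusters. It ignores emoji sequences, Hangul jamo, ZWJ, and so on.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


def reverse_by_cluster(text):
    """Reverse the order of the (approximate) clusters, not the code points."""
    return "".join(reversed(approx_clusters(text)))


# "café" written with a combining acute accent after the "e".
accented = "cafe\u0301"
naive = accented[::-1]                      # accent now leads the string
by_cluster = reverse_by_cluster(accented)   # accent stays on the "e"

# An embedded right-to-left run: the opener must come before its closer.
embedded = "abc " + RLE + "xyz" + PDF + " def"
reversed_embedded = reverse_by_cluster(embedded)

# After reversal the pop comes first, so it no longer closes the embedding.
pop_before_embed = reversed_embedded.index(PDF) < reversed_embedded.index(RLE)
```

Cluster-level reversal fixes the stray accent, but `pop_before_embed` comes out true: the formatting codes are each their own cluster, so their order flips along with everything else. Handling that properly would mean treating the embedding and its pop as a matched pair, which this sketch doesn't attempt.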



Grapheme clusters are a poor approximation of the vaguely-defined linguistic-level concept you're groping for.


Well, yes, but we gotta stop somewhere or just give up any hope of computers operating on text.

Although I think grapheme clusters are a pretty good approximation, in that a cluster is usually what you want a single backspace to delete in a word processor.


Is there a better approximation?



