Hacker News

> There isn't any such thing as "characters" in code. In documentation when they say "characters" usually they mean bytes, code units or code points. Almost never do they mean graphemes, which is intuitively what people think they mean. The bottom line is two-fold: (A) always understand what is meant in documentation by "length in characters",

That's because most languages have a built-in char type, and it typically holds a byte, a code unit, or a code point rather than a grapheme.

> don't try to use graphemes as your unit of length, it won't work in practice.

Swift does this and it's a really good thing. Everything is in graphemes by default -- char segmentation, indexing, length, etc.

Far too many bugs come from programmers treating a "code point" as a segmentable unit of text, which breaks many non-Latin scripts, not to mention emoji.
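To make the difference concrete, here is a rough Python sketch. The sample strings and the `unit_counts` helper are made up for illustration. Each sample is what a reader would normally see as a single grapheme, yet none of the three common units counts it as 1, and slicing by code point cuts it apart.

```python
# Hypothetical samples, chosen only for illustration:
# "e" followed by COMBINING ACUTE ACCENT (U+0301), and a family emoji
# built from three emoji joined by ZERO WIDTH JOINER (U+200D).
accented = "e\u0301"
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"


def unit_counts(text: str) -> dict:
    """Count the units that documentation may mean by "characters"."""
    return {
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "code_points": len(text),  # Python's len() counts code points
    }


accented_counts = unit_counts(accented)
family_counts = unit_counts(family)

# Slicing by code point splits what the reader sees as one unit.
accent_dropped = accented[:1]  # bare "e", the combining accent is gone
family_head = family[:1]       # only the first emoji, not the family
```

As far as I know Python's standard library has no grapheme segmenter, so getting a count of 1 for either string takes an external library. That gap is the point: Swift gives you the grapheme view by default, most other languages give you one of the three counts above.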



