I happen to think baking Unicode into your concept of a string is fundamentally misguided, so all string operations following from that premise are inherently wrong. The very first example, contrasting encoded byte length with String.length("é") = 1 and calling the latter the "proper length", walks into a shibboleth: it puts Elixir on the side of String.length("ﷺ") = 1, even with the grapheme-cluster concept, for which the only salvation is integrated font rendering.
It's practical and informative, but I can't consider it well-thought-out.
ed: to clarify, ﷺ is an Arabic ligature which represents far more than one (linguistic) character. A more accessible example might be "ﬃ".
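A minimal sketch in iex (assuming a font that actually renders these ligature codepoints):

    # Each ligature is a single codepoint, so both the codepoint count
    # and the grapheme count report 1, even though a reader sees
    # several linguistic characters.
    String.length("ﷺ")                   #=> 1  (U+FDFA, a whole Arabic phrase)
    String.length("ﬃ")                   #=> 1  (U+FB03, the ffi ligature)
    String.codepoints("ﷺ") |> length()   #=> 1
    byte_size("ﷺ")                       #=> 3  (UTF-8 bytes, for contrast)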
I could be wrong, but I think the reason String.length returns one is to have a consistent idea of what happens with monospaced console output. Things in the Elixir standard library exist "when you need them for Elixir itself", and working monospaced console output formatting is needed in a few parts of Elixir. If you only care about bytes, you can use byte_size, as indicated in the docs.
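A minimal sketch of that distinction (iex):

    # String.length/1 counts grapheme clusters; byte_size/1 counts UTF-8 bytes.
    String.length("é")        #=> 1
    byte_size("é")            #=> 2
    String.length("e\u0301")  #=> 1  (e + combining acute: one grapheme, two codepoints)
    byte_size("e\u0301")      #=> 3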
No, codepoint length is totally useless for monospaced console output; see the third example. Grapheme clusters are closer, but still wrong in the presence of wide characters.
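A concrete sketch (column behavior assumed to follow East Asian Width rules; actual rendering depends on the terminal):

    # One grapheme, but most monospaced terminals draw this CJK character
    # two columns wide, so padding by grapheme count misaligns output.
    String.length("字")                   #=> 1
    String.pad_trailing("字", 3) <> "|"   #=> "字  |"  (renders 4 columns wide, not 3)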
I've written a fuzzing library that tests random Unicode inputs, and the width of the output was sensible on three platforms (Linux, Mac, and PowerShell).