It's sad and odd that Rust and (probably especially) Swift are missing from the ...

chrisseaton · on May 30, 2019

Why? Are there interesting technical differences in the way those languages do things compared to the other examples given?

The author obviously can't cover all languages and strategies in a short article, can they?

saagarjha · on May 30, 2019

Yes: Swift groups by grapheme clusters, and Rust makes it difficult to do byte indexing.

masklinn · on May 30, 2019

> Rust makes it difficult to do byte indexing.

Not sure how. If you want to get a specific byte, just convert to a bytes slice (that's free) and index that. And you can slice strings (using byte-indexed indices), but your boundaries have to fall on codepoint boundaries. The only thing that's difficult is getting a codepoint at a specific index (byte or otherwise).
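A rough sketch of the three cases described above, using only the standard library. The helper names (`byte_at`, `nth_char`, `char_at_byte`) are made up for illustration, not anything from the article or from `std`.

```rust
/// Byte at byte offset `i`. Panics if `i` is out of bounds.
/// `as_bytes` is a free reinterpretation, no copy or validation.
fn byte_at(s: &str, i: usize) -> u8 {
    s.as_bytes()[i]
}

/// The `n`-th codepoint (0-based). This is the awkward case:
/// it needs a linear scan because codepoints vary in width.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

/// The codepoint starting at byte offset `i`, or `None` if `i`
/// is past the end or not on a codepoint boundary.
fn char_at_byte(s: &str, i: usize) -> Option<char> {
    s.get(i..)?.chars().next()
}

fn main() {
    let s = "héllo"; // 'é' occupies bytes 1 and 2 (0xC3 0xA9)

    println!("{:#x}", byte_at(s, 1)); // first byte of 'é'
    println!("{}", &s[0..3]); // byte range ending on a boundary
    println!("{:?}", nth_char(s, 2));
    println!("{:?}", char_at_byte(s, 2)); // offset 2 is inside 'é'
    println!("{:?}", char_at_byte(s, 3));
}
```

Note that `&s[0..2]` would panic here, since byte 2 falls in the middle of `é`.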

paulddraper · on May 30, 2019

> just convert to a bytes slice (that's free) and index that

Byte indexing of strings.

If you explicitly convert your string to bytes then, yeah, naturally it's easy to byte index.

afiori · on May 30, 2019

> Not sure how. If you want to get a specific byte, just convert to a slice (that's free) and index that.

But then it is not automatic to cast that slice as a string.

burntsushi · on May 30, 2019

If you want a single byte and `s` is a `&str`, then `s.as_bytes()[i]` returns the `u8` in `s` at byte index `i`. If the index `i` is out of bounds, then it panics, but no other UTF-8 checking is performed.

You do not need to do this if you're slicing. For example, if you know that `i..j` indexes a valid UTF-8 subslice of `s`, then `&s[i..j]` returns a subslice of `s` with type `&str`.

The only reason to subslice `s.as_bytes()` is if you want the raw bytes which may or may not be valid UTF-8. And in this case, it is a good thing that it is not automatic to convert that back to a `&str` since it may not be valid UTF-8.
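To make the last point concrete, here is a small sketch of why going from a raw byte subslice back to `&str` has to be an explicit, fallible step. The wrapper names `raw_bytes` and `checked_slice` are hypothetical; `std::str::from_utf8` and `str::get` are the real standard library calls doing the work.

```rust
use std::str;

/// Raw bytes in `i..j`, with no UTF-8 check on the boundaries.
/// Panics only if the range is out of bounds.
fn raw_bytes(s: &str, i: usize, j: usize) -> &[u8] {
    &s.as_bytes()[i..j]
}

/// Non-panicking version of `&s[i..j]`: `None` if the range is out
/// of bounds or does not fall on codepoint boundaries.
fn checked_slice(s: &str, i: usize, j: usize) -> Option<&str> {
    s.get(i..j)
}

fn main() {
    let s = "naïve"; // 'ï' occupies bytes 2 and 3 (0xC3 0xAF)

    // Cutting through the middle of 'ï' is fine for raw bytes...
    let cut = raw_bytes(s, 0, 3);
    // ...but the result is not valid UTF-8, so the conversion fails.
    println!("{}", str::from_utf8(cut).is_err());

    // A range that ends on a boundary converts back without trouble.
    println!("{:?}", str::from_utf8(raw_bytes(s, 0, 4)));

    // The same distinction when slicing the `&str` directly.
    println!("{:?}", checked_slice(s, 0, 3));
    println!("{:?}", checked_slice(s, 0, 4));
}
```

`from_utf8` validates the bytes it is given, which is the cost that `&s[i..j]` avoids by only checking the two boundaries.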

afiori · on May 30, 2019

> it is a good thing that it is not automatic to convert that back to a `&str` since it may not be valid UTF-8.

My comment was unclear, but the aim was to point out exactly this.