UTF-8 works fine if you truncate a codepoint because the encoding scheme lets you detect this. The problem is more subtle than that (hint: it involves a 1-byte codepoint).
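To make the "lets you detect this" part concrete, here is a minimal sketch in C. The helper name `utf8_complete_prefix` is hypothetical (it is not from the article or any library), and it assumes the buffer was valid UTF-8 before it was cut at `len` bytes. It returns how many leading bytes you can keep without ending in the middle of a multi-byte sequence.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Assumption: buf held valid UTF-8 before being cut at len bytes.
   Returns the length of the longest prefix of buf[0..len) that does
   not end inside a multi-byte sequence. */
static size_t utf8_complete_prefix(const unsigned char *buf, size_t len)
{
    size_t i = len;    /* index just past the byte being examined */
    size_t back = 0;   /* continuation bytes skipped so far */

    /* Walk back over at most 3 trailing continuation bytes (10xxxxxx). */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0)
        return len;    /* empty, or no lead byte found: leave unchanged */

    unsigned char lead = buf[i - 1];
    size_t need;       /* total bytes the final sequence should have */
    if (lead < 0x80)
        need = 1;      /* ASCII */
    else if ((lead & 0xE0) == 0xC0)
        need = 2;
    else if ((lead & 0xF0) == 0xE0)
        need = 3;
    else if ((lead & 0xF8) == 0xF0)
        need = 4;
    else
        return len;    /* not a lead byte: assumption violated, leave as-is */

    size_t have = back + 1;  /* bytes actually present from lead to the end */
    return have >= need ? len : i - 1;  /* drop the incomplete sequence */
}
```

For "héllo" (bytes `68 C3 A9 6C 6C 6F`), a cut at 2 bytes lands inside "é" and the sketch reports 1 byte to keep, while a cut at 3 bytes is already on a boundary and is left alone. It only looks at the last few bytes, so it does not validate the rest of the buffer.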
Truncating a UTF-8 codepoint is not fine because most software is not tested with partially broken UTF-8 so international users will likely run into many bugs.
Especially because concatenation is a very common operation so those sliced codepoints will be everywhere, including in the middle of text.
Morally I view “what do I do with my truncated string” to be a separate issue from “how do I truncate the string” as described in the article. Like, yes, you absolutely should not concatenate after doing this operation. But maybe you shouldn’t be showing the user a truncated string either even if it’s all ASCII. The question of “did you make an unparseable UTF-8 string” is answered with “no” and the more complicated but also more interesting question of “did you actually want this” remains unanswered.
If you're alluding to NUL, I don't really see the issue?
Yes, many languages allow strings (UTF-8 or otherwise) to contain null bytes, and C's str*() functions obviously do not, but null-termination vs not is an orthogonal issue to ASCII vs UTF-8.
I.e., yes, it is (depending on context) an issue that C's str*() functions cannot handle strings with embedded null bytes, but that is not a UTF-8-specific issue.
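A small sketch of what I mean, assuming a hypothetical length-carrying `struct byte_str` (not a standard C type) and a hypothetical helper `c_visible_len`. It reports how much of a buffer the str*() family would see, and the answer depends only on where the first 0 byte sits, not on the encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string; not a standard C type. */
struct byte_str {
    const char *data;
    size_t len;      /* real length in bytes, embedded 0 bytes included */
};

/* Length as the str*() functions would see it: everything before the
   first 0 byte, or the whole buffer if there is none. */
static size_t c_visible_len(struct byte_str s)
{
    const char *nul = memchr(s.data, '\0', s.len);
    return nul ? (size_t)(nul - s.data) : s.len;
}
```

With the pure-ASCII bytes `"ab\0cd"` (5 bytes), the visible length is 2. With the UTF-8 bytes `"h\xC3\xA9\0llo"` (7 bytes), it is 3. Both lose their tail in the same way, so the failure mode is the same for either encoding.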
One problem here is that the string may not have been a correctly formatted UTF-8 string to begin with. No, not in the sense that it can contain arbitrary bytes; I mean that it might be expected to satisfy more than just decoding correctly. Maybe it is supposed to have its grapheme clusters preserved. Maybe the truncation should peel off the last file component because the string holds a path. The operation of “doing a dumb truncation” can be wrong in plenty of ways, and I don’t disagree with you, but I do want to make clear that the issue isn’t that memcpy is breaking the string; it’s that if you need x, y, z, maybe you’re reaching for the wrong tool. Conversely, there is nothing inherently wrong with using it if you use the result in a way that is resilient to that kind of truncation.
What about a function that can turn a correctly spelled English sentence into a malformed English sentence? If you truncate to a fixed length, this comes with the territory.
You could have just said it rather than going through this smug "I know something you don't know!" song and dance.
Also, by this rationale, NO string is ever safe in C, because pretty much every encoding technically supports codepoint 0 (even though you take your life into your hands should you ever use it). This is not a useful discussion.
By that metric, C can't represent ASCII correctly either, because there's no particular reason you couldn't have a NUL character somewhere inside a string.