> The only reason I can see is in ensuring that text is losslessly convertible to other UTFs, particularly UTF-16 (which exists for historical reasons), but this just seems like a matter of when the information is lost (is it during conversion from string to UTF-16, or from bytes to string), not if it is lost.
And you can even avoid losing that information altogether with an encoding like WTF-8, which extends UTF-8 so that unpaired surrogates survive a round trip: https://simonsapin.github.io/wtf-8/
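
For what it's worth, here's a rough Python sketch of the idea. Caveat: this uses Python's built-in `surrogatepass` error handler, which is not a real WTF-8 implementation, just a close stand-in for the lone-surrogate case (real WTF-8 additionally requires that a high/low surrogate pair be merged into a single 4-byte sequence, which `surrogatepass` doesn't enforce). The variable names are mine.

```python
# A lone high surrogate: a legal UTF-16 code unit, but not a Unicode scalar value.
text = "a\ud83db"

# Strict UTF-8 refuses to encode it, so the information would have to be
# dropped or replaced at this point.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With surrogatepass, the surrogate is written as an ordinary 3-byte sequence
# (this is also what WTF-8 does for an unpaired surrogate).
encoded = text.encode("utf-8", "surrogatepass")
decoded = encoded.decode("utf-8", "surrogatepass")

# The same string also converts to UTF-16 and back without loss.
utf16 = text.encode("utf-16-le", "surrogatepass")
from_utf16 = utf16.decode("utf-16-le", "surrogatepass")

print(strict_ok)           # False
print(encoded)             # b'a\xed\xa0\xbdb'
print(decoded == text)     # True
print(from_utf16 == text)  # True
```

So nothing has to be lost at either boundary, as long as both sides agree that the byte form is "UTF-8 plus surrogates" rather than strict UTF-8.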