
> That makes no difference whatsoever. All UTFs are mappings of USVs to bytes.

As a last attempt to clarify what I meant in my original post (requoted below):

> The only reason I can see is in ensuring that text is losslessly convertible to other UTFs, particularly UTF-16 (which exists for historical reasons)

What I mean here is that if converting to another UTF is important, then maybe restricting strings to code points or Unicode scalar values is justified. But if textual data is stored as bytes that are conventionally UTF-8, there should be no need to convert to a UTF at all, since ultimately the only UTF that is useful should be UTF-8. Any byte sequence that isn't valid UTF-8 has no scalar values to map to, so all you would be doing by "converting to a UTF" is losing information.
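
To make the "losing information" part concrete, here is a rough Python sketch. The byte string and the helper name `survives_round_trip` are made up for illustration, and I'm assuming a lossy decode (`errors="replace"`) stands in for "converting to a UTF":

```python
# Bytes that are "conventionally UTF-8" but contain one invalid byte (0xFF).
# This particular value is a made-up example.
raw = b"caf\xc3\xa9 \xff.txt"

# Converting to a UTF means going through Unicode scalar values first.
# 0xFF has no scalar value, so a lossy decode substitutes U+FFFD.
decoded = raw.decode("utf-8", errors="replace")

# Re-encoding produces valid UTF-8, but not the bytes we started with.
round_tripped = decoded.encode("utf-8")


def survives_round_trip(data: bytes) -> bool:
    """Hypothetical helper: True if bytes -> scalar values -> bytes is lossless."""
    return data.decode("utf-8", errors="replace").encode("utf-8") == data


print(raw == round_tripped)         # False: the original 0xFF byte is gone
print(survives_round_trip(b"abc"))  # True: valid UTF-8 is unaffected
```

If you just keep the bytes as they are, nothing is lost, which is the whole point.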

That was my last attempt. I'm sorry if you still can't understand it.



