Hacker News

"Documents must be utf-8 and should have a byte order mark."

No. If you're using UTF-8 (which is a good choice), the use of a BOM should be discouraged. Given that the format specification says documents MUST be UTF-8, there is no need to enable detection of UTF-8 content with the UTF-8 BOM. And, of course, the original purpose of the BOM (detecting big- or little-endian encoding) is unnecessary in UTF-8.

The Unicode standard, section 2.6, says, "Use of a BOM is neither required nor recommended for UTF-8".

While it is allowed, if you're making a new spec for a new data format, you shouldn't recommend the use of a BOM in UTF-8.
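
To make the point above concrete, here is a minimal sketch (Python, standard library only) of how a reader of a UTF-8-only format can tolerate a BOM without requiring one. The function name `decode_utf8_document` and the sample text are made up for illustration and are not from the spec under discussion.

```python
import codecs


def decode_utf8_document(data: bytes) -> str:
    """Decode bytes that are required to be UTF-8, tolerating an optional BOM.

    Hypothetical helper for illustration only.
    """
    # The 'utf-8-sig' codec drops a leading EF BB BF if it is present and
    # otherwise behaves like plain 'utf-8'.
    return data.decode("utf-8-sig")


# The UTF-8 "BOM" is a fixed three-byte signature; it says nothing about
# byte order, because UTF-8 has only one byte order.
bom = codecs.BOM_UTF8

with_bom = bom + "héllo".encode("utf-8")
without_bom = "héllo".encode("utf-8")

# Both inputs decode to the same text when the reader tolerates the BOM.
same_text = decode_utf8_document(with_bom) == decode_utf8_document(without_bom)

# A plain 'utf-8' decode keeps the BOM as a leading U+FEFF character,
# which is the usual way a BOM leaks into parsed content.
leaked = with_bom.decode("utf-8")
```

Since the encoding is already fixed by the spec, the BOM adds no information for the reader; the only work it creates is stripping it.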




There exists a charset that is more efficient than UTF-8.


I am futureproofing.


> There exists a charset that is more efficient than UTF-8.

> I am futureproofing.

Which is it: does such a charset exist today, or are you merely preparing for one that might exist in the future?


The charset is not published yet. When it does eventuate, byte order marks will safeguard the interpretation of documents.


Which is…?




