
JavaScript strings are not UTF-16; if they were, string operations would only ever expose whole code points. JavaScript "strings" are UCS-2, which is trivial to demonstrate: "\ud83c" is a valid JavaScript string, but it is not valid UTF-16.
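For concreteness, a quick sketch (my own example, not part of the original claim; assumes a browser or recent Node.js where TextEncoder is a global):

    // A lone high surrogate is a perfectly legal JavaScript string value.
    const s = "\ud83c";                        // high surrogate, no matching low surrogate
    console.log(s.length);                     // 1
    console.log(s.charCodeAt(0).toString(16)); // "d83c"
    // Only at an encoding boundary does it get rejected: encoding to UTF-8
    // replaces the unpaired surrogate with U+FFFD (0xEF 0xBF 0xBD).
    console.log(new TextEncoder().encode(s));  // Uint8Array [239, 191, 189]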

Here's the relevant section of the Unicode FAQ on the subject:

> UCS-2 does not describe a data format distinct from UTF-16, because both use exactly the same 16-bit code unit representations. However, UCS-2 does not interpret surrogate code points, and thus cannot be used to conformantly represent supplementary characters.

A correct UTF-16 implementation would interpret surrogate code points, validate that they're paired, and prevent access to either surrogate on its own via string operations.
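A sketch of what JavaScript does instead (my own illustration; the mushroom emoji is just an arbitrary supplementary-plane character):

    // String operations expose and split raw surrogates, which a conforming
    // UTF-16 API would not allow.
    const emoji = "\u{1F344}";                     // U+1F344 MUSHROOM, one code point
    console.log(emoji.length);                     // 2  (two 16-bit code units)
    console.log(emoji.charCodeAt(0).toString(16)); // "d83c" (the high surrogate, exposed directly)
    console.log(emoji.slice(0, 1));                // a lone surrogate, silently produced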



ES6 did get some new functions to correctly deal with surrogate pairs in strings. In the end, JS strings are just a sequence of 16-bit values, with the unfortunate consequence that many string functions interpret them as UCS-2 and only some newer functions interpret them as UTF-16.
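A minimal sketch of the split between the old and new functions (my own example; output assumes any ES6-capable engine):

    const s = "\u{1F344}a";
    // Older, UCS-2-era views of the string: counts 16-bit units.
    console.log(s.length);                      // 3
    console.log(s.charCodeAt(0).toString(16));  // "d83c"
    // ES6 additions that understand surrogate pairs as UTF-16:
    console.log(s.codePointAt(0).toString(16)); // "1f344"
    console.log([...s].length);                 // 2  (the string iterator walks code points)
    console.log(String.fromCodePoint(0x1F344) === "\ud83c\udf44"); // true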

When you come across an invalid sequence while decoding input (like "\ud83c"), you generally have three choices: throw an exception, skip the invalid part, or replace it with a replacement character. The default JavaScript behavior is to be lenient. But if you need more control over the decoding behavior, you can use StringView or TextDecoder, which is part of this spec: https://encoding.spec.whatwg.org/
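A sketch of the TextDecoder side of that (my own example; the bytes are just the lone surrogate from above, naively encoded, which is invalid UTF-8):

    const bad = new Uint8Array([0xed, 0xa0, 0xbc]);
    // Default: lenient, invalid bytes become U+FFFD replacement characters.
    console.log(new TextDecoder("utf-8").decode(bad)); // "\ufffd\ufffd\ufffd"
    // With { fatal: true }: throw instead of replacing.
    try {
      new TextDecoder("utf-8", { fatal: true }).decode(bad);
    } catch (e) {
      console.log(e instanceof TypeError); // true
    }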


> ES6 did get some new functions to correctly deal with surrogate pairs in strings. In the end, JS strings are just a sequence of 16-bit values

Which is exactly why they are not and cannot be UTF-16.

> The default JavaScript behavior is to be lenient.

The JavaScript behaviour is to have UCS-2 "strings".
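Which is easy to see (my own sketch, same lone surrogate as upthread): ordinary string operations never validate surrogates, they just shuffle 16-bit units around.

    const lone = "\ud83c";                          // high surrogate on its own
    console.log(("x" + lone + "y").length);         // 3, the lone surrogate survives intact
    console.log((lone + "\udf44") === "\u{1F344}"); // true, pairing is purely positional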



