Hacker News new | past | comments | ask | show | jobs | submit login

The language maintainers have already decided to slow down string-construction by calling StringUTF16.compress(value) which checks if the input can be converted into LATIN1 and if so, creating a LATIN1 byte-array representation from the ground up.

Compared to this, I wager that the incremental cost is small to verify that the coder matches the content.

Also, depending on the implementation of string-interning and StringUTF16.compress, it's possible that the verification step is only needed for non-LATIN1 strings. Which according to the JEP is only a small minority of strings seen.

If there is a cheaper solution, I'm certainly all for it. I'm just spitballing ideas here. But I don't think it is acceptable to allow "hello world".startsWith("hello") to return false.




I don’t think you need any defensive copy, just to have compress switch to utf16 mode when it encounters a non-latin1 char and go on from there.

It would likely be faster since currently the implementation restarts the entire conversion from 0 in utf16 mode.


Making the 99% case faster at the cost of making the 1% case slower, is a speedup.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: