Hacker News
A string of unexpected lengths (hackerschool.com)
45 points by thomasballinger on Feb 20, 2015 | 6 comments



Swift's String type (https://developer.apple.com/library/ios/documentation/Swift/...) uses extended grapheme clusters as characters. If you need the raw Unicode scalars or the UTF-16/UTF-8 code units, there are separate "views" into the string that let you iterate over them.

Using extended grapheme clusters as characters means truncation, concatenation, length measurement and transformations like reversal all work in the expected way, even in the presence of combining characters. More standard libraries should consider adopting this model!
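
To make the distinction concrete, here is a minimal sketch in Python rather than Swift. Python's str is indexed by code points, not grapheme clusters, so it shows the failure mode the comment above describes. The sample string and variable names are illustrative assumptions, not anything from the article.

```python
import unicodedata

# "é" written as a base letter "e" plus a combining acute accent (U+0301).
s = "cafe\u0301"

code_points = len(s)                            # Unicode scalars
utf8_units = len(s.encode("utf-8"))             # UTF-8 code units (bytes)
utf16_units = len(s.encode("utf-16-le")) // 2   # UTF-16 code units

# Reversal by code point detaches the accent from its base letter.
naive_reversed = s[::-1]

# NFC normalization composes the pair into a single code point (U+00E9).
# This only helps when a precomposed form exists; it is not a general
# substitute for grapheme cluster segmentation.
composed = unicodedata.normalize("NFC", s)

print(code_points, utf8_units, utf16_units, len(composed))
```

A reader sees four characters, but none of the first three counts agree with that. A grapheme-cluster-based string type would report four and keep the accent attached when reversing.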


How do you find this in practice? I mention this in the 3rd footnote of the post, but didn't have much to say about it because I haven't used it myself. The one person I've talked to about this says it's rather awkward right now because Swift doesn't provide great tools for working with these objects.


So does Swift use a rope? Or is getting the Nth character of a string O(n) time?
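
As a rough illustration of why Nth-element lookup tends to be linear in any variable-width representation, here is a sketch that finds the Nth code point in UTF-8 bytes by scanning from the start. It says nothing about how Swift actually stores strings, and it works on code points rather than grapheme clusters, which would need additional boundary checks at each step. The function name is hypothetical.

```python
def byte_offset_of_code_point(data: bytes, n: int) -> int:
    """Return the byte offset where the n-th (0-based) code point starts.

    Assumes `data` is valid UTF-8. The scan visits every byte before the
    target, so the cost grows with n.
    """
    seen = -1
    for offset, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts
        # a new code point.
        if byte & 0xC0 != 0x80:
            seen += 1
            if seen == n:
                return offset
    raise IndexError("code point index out of range")


data = "naïve café".encode("utf-8")
print(byte_offset_of_code_point(data, 3))  # offset of "v", after the 2-byte "ï"
```

Because each element can occupy a different number of bytes, there is no arithmetic shortcut from an index to an offset without either scanning or keeping an extra index structure.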


Here I thought everyone learned this when they started colorizing their bash prompt and suddenly their terminal session would wrap the line before the cursor got to the right side of the screen :-)

The article fails to discuss non-monospaced fonts, which make this problem even more pronounced. For a long time Word would screw up a ligature if it also included a color change. The only hope is length(source_data, rendering_environment) as a function.
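
The bash prompt case can be sketched as follows: the string length and the number of terminal columns it occupies differ once color escape sequences are involved. This is a minimal Python illustration that assumes only SGR color/style sequences are present; the regex, function name, and sample prompt are hypothetical, and real terminals understand many more escape sequences than this handles.

```python
import re

# Matches SGR color/style sequences such as "\x1b[31m" or "\x1b[1;32m".
# Other escape sequences (cursor movement, titles, etc.) are not covered.
SGR_SEQUENCE = re.compile(r"\x1b\[[0-9;]*m")


def visible_length(text: str) -> int:
    """Count the characters left after stripping SGR escape sequences."""
    return len(SGR_SEQUENCE.sub("", text))


prompt = "\x1b[1;32muser@host\x1b[0m:~$ "
print(len(prompt), visible_length(prompt))
```

Bash cannot know which bytes are invisible on its own, which is why non-printing sequences in PS1 need to be wrapped in \[ and \] so that line wrapping is computed from the visible part only.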


> The article fails to discuss non-monospaced fonts, which make this problem even more pronounced

I think proportional fonts make this issue less pronounced, because for more people it makes intuitive sense that you can't get string width by just counting characters, and thus they end up using proper measurement functions.


That is an excellent point.



