
I have no experience with Korean and Unicode, but I guess it's jamo composition?

With it, there are two distinct ways to specify a single Hangul character, and they are not really compatible, since hardly anyone in real-world practice cares about Unicode normalization. That should make text processing a bit painful.
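
To make the two spellings concrete, here is a minimal sketch in Python, assuming only the standard library's unicodedata module. The variable names are made up for illustration, and the syllable 한 is just an arbitrary example:

```python
import unicodedata

# The syllable 한 as one precomposed code point (U+D55C).
precomposed = "\uD55C"

# The same syllable as three conjoining jamo:
# U+1112 (initial ᄒ), U+1161 (vowel ᅡ), U+11AB (final ᆫ).
decomposed = "\u1112\u1161\u11AB"

# Without normalization the two strings differ, even though they
# are meant to render as the same character.
same_without_normalizing = precomposed == decomposed

# NFC composes the jamo into the syllable; NFD splits it back apart.
composed = unicodedata.normalize("NFC", decomposed)
split_again = unicodedata.normalize("NFD", precomposed)

print(same_without_normalizing)    # False
print(composed == precomposed)     # True
print(split_again == decomposed)   # True
print(len(precomposed), len(decomposed))  # 1 3
```

So a naive equality check, a length count, or a dictionary lookup can all disagree about text that looks identical on screen.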




Anyone who writes software that deals with Unicode really had better care about Unicode normalization!

You are going to have all sorts of edge case bugs and unexpected behaviors if you don't, definitely not limited to Hangul. There are multiple un-normalized ways to write all sorts of on-screen glyphs, including Latin alphabet ones.
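
A Latin-alphabet case looks like this. This is only a sketch, assuming Python's standard unicodedata module; the helper name nfc_equal is hypothetical, and picking NFC rather than another normalization form is my assumption, not a rule:

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# "café" with a precomposed é (U+00E9).
word_precomposed = "caf\u00E9"

# "café" with a plain e followed by a combining acute accent (U+0301).
word_combining = "cafe\u0301"

raw_match = word_precomposed == word_combining                # False
normalized_match = nfc_equal(word_precomposed, word_combining)  # True

print(raw_match, normalized_match)
```

The usual approach is to normalize once at the boundary (input, storage, or comparison) and stay consistent, rather than sprinkling normalize calls everywhere.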


I thought wbl meant something to do with the combination of Korean and UTF-8, whereas jamo composition is no more related to UTF-8 than to any other Unicode encoding.



