
This looks very much like Make Me A Hanzi (MMAH), which is exactly the same (Chinese) characters. It's just that Japanese knows those as Kanji.

https://github.com/skishore/makemeahanzi




(Caveat: there are a handful of cases where the Chinese and Japanese stroke orders differ, so take care treating these as equivalent.)


This. Traditional Chinese and Japanese are two seemingly similar but distinctly different things (the latter is forked from the former).

Simplified Chinese is a whole different beast of corruption that is a fork of Traditional Chinese but otherwise not similar to either.


Kanji were also simplified but not necessarily the same way as simplified Chinese. Simplified Chinese also sometimes uses 'old' characters.

So sometimes modern Japanese is actually similar to simplified Chinese, sometimes it is similar to traditional Chinese, and sometimes it is unique. There is no simple 'fork'.

For instance, 円 (yen) is simplified Japanese and uniquely Japanese. It used to be the same as traditional Chinese 圓. In Chinese it was separately simplified twice to its current form 元. So when you see prices in 円 in Japan and 元 in China, it's actually the same original character simplified differently.

Interestingly, traditional Chinese 國 was simplified by reusing an old character and is now 国 both in Japanese and simplified Chinese but not in traditional Chinese.

Why not.


Sometimes Blink is similar to WebKit, sometimes it is similar to KHTML and sometimes it is unique.

I think the fork analogy still holds; no one says forks can't have convergent evolution or cross-pollination.


To further complicate things, there are the "kokuji": characters that were invented in Japan, used only in Japan, and only have Japanese pronunciations, yet are still considered kanji ("Chinese characters"). Examples:

働 "work" 峠 "mountain pass"

https://en.wikipedia.org/wiki/Kanji#Kokuji


I think 'kanji' should be interpreted at large. This is the Chinese writing system, and inventing new characters (which happens everywhere this writing system is used) adds to the whole corpus of kanji/hanzi even if some are invented or used in specific countries.

峠 has a Mandarin pronunciation, apparently: https://baike.baidu.com/item/%E5%B3%A0/4336929


>峠 "mountain pass"

Incidentally, the backstory behind that kanji is hilarious.

The kanji is composed of the kanji for "mountain" on the left side, and "up" and "down" on the upper right and lower right sides respectively.

You know what a mountain pass does? Go up and down a mountain.


This is how a lot of kanji are formed. For example 町 (town) is 田 (rice paddy) + 丁 (street). I guess at some point in language formation a lot of towns were primarily collections of rice paddies.


働 exists in Chinese too, meaning "labor". 峠 does not exist in Chinese, although many Chinese dictionaries (most notably CC-CEDICT) include it.


Definitely not exactly the same. Kanji and Hanzi are two different character sets - they overlap a lot, but each has common everyday characters that aren't in the other, and sometimes the "same" character is in both sets but written differently in various languages (e.g. 骨).


In case anyone is wondering why different glyphs have the same Unicode code point, and how an app is supposed to decide which one to render... Well, I don't actually know the answer to the first question, though many people appear to have some choice comments.

But as for the second question: for HTML documents, many tags accept a lang attribute that decides which version of the glyph to render within that tag. Hacker News has lang="en", so the browser falls back to a user setting. For example, in Firefox's about:config, there's a setting called cjk_pref_fallback_order. If e.g. ja comes first, the little square inside the top square in 骨 is rendered on the right side; if any zh variant comes first, it's rendered on the left side.
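
To make the lang mechanism concrete, here is a minimal sketch that builds a tiny HTML page containing the same character, 骨 (U+9AA8), three times under different lang tags. The helper names (lang_span, demo_page) are made up for illustration, and whether the three copies actually look different depends on the browser and on which CJK fonts are installed.

```python
from html import escape

BONE = "\u9aa8"  # 骨, one code point shared by the Japanese and Chinese forms


def lang_span(text: str, lang: str) -> str:
    """Wrap text in a <span> that carries a lang attribute (hypothetical helper)."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


def demo_page() -> str:
    """Return a small HTML document showing the same character under three lang tags."""
    spans = [lang_span(BONE, tag) for tag in ("ja", "zh-CN", "zh-TW")]
    return '<!doctype html>\n<meta charset="utf-8">\n<p>' + " ".join(spans) + "</p>\n"


if __name__ == "__main__":
    # Save the output to a file and open it in a browser to compare the glyphs.
    print(demo_page())
```

The text inside each span is byte-for-byte identical; only the lang attribute differs, which is all the renderer has to go on when picking a regional glyph.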


> In case anyone is wondering why different glyphs have the same unicode code point, and how an app is supposed to decide which one to render... Well I don't know the reason for the first question actually

https://en.wikipedia.org/wiki/Han_unification

My understanding is that this is basically "white guy says all Asian writing looks the same" in standards form and is largely regarded as a terrible idea.


Unicode had a builtin language tagging system to resolve glyph variants. Han unification was implemented with this in mind. Then the tagging got deprecated in a later version.


The more I learn about Unicode, the more it looks like Bad Ideas: The Standard to me. The only good part of it is the UTF-8 encoding, and that was just Thompson and Pike sitting down and thinking about the problem for an hour.


UTF-8 is just a simplified improvement on another variable encoding. They didn't conjure it from nothing.


The little inside square is inconsistent in Chinese.

For instance, traditional Chinese in China will put it on the left: 過. Most computer systems will type this one.

But in Taiwan they put it on the right side. That said, I don't entirely understand how it works. You can't even copy/paste the right-hand version into this comment box, for instance, but you can see it on wiki. Maybe they're separate fonts? Really not sure. Maybe somebody knows better.

And simplified is entirely different: 过
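
On the copy/paste puzzle: a quick way to see what is going on is to print the code points. This is a small sketch using Python's standard unicodedata module; describe is a made-up helper name. Under Han unification the left-hand and right-hand forms of 過 share one code point, so pasted text cannot carry the difference and the font (plus any language hint) decides. The simplified 过 is a separate code point.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in ("過", "过", "骨"):
        print(ch, describe(ch))
```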


> This looks very much like Make Me A Hanzi (MMAH), which is exactly the same (Chinese) characters. It's just that Japanese knows those as Kanji.

Well "kanji" literally means "Han character".


That's a neat project. While they are extremely similar, there are still many variations. One small example: 今 is written with a horizontal stroke in Japan but a slanted stroke in mainland China.

https://en.wiktionary.org/wiki/%E4%BB%8A#Alternative_forms

https://github.com/KanjiVG/kanjivg/blob/master/kanji/04eca.s...

https://makemeahanzi.herokuapp.com/#/codepoint/20170
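
Both project links above point at the same character: 04eca is the code point of 今 in zero-padded hexadecimal, and 20170 is the same code point in decimal. A short sketch of the conversion, with hex_id and decimal_id as made-up helper names (the five-digit padding is inferred from the link above, not from either project's documentation):

```python
def hex_id(ch: str) -> str:
    """Code point as five lowercase hex digits, matching the '04eca' in the KanjiVG link."""
    return format(ord(ch), "05x")


def decimal_id(ch: str) -> int:
    """Code point as a decimal integer, matching the number in the MMAH link."""
    return ord(ch)


if __name__ == "__main__":
    print(hex_id("今"), decimal_id("今"))
```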



