Show HN: Neural Net Writes Fake Kanji in Your Web Browser, Stroke-By-Stroke

tdeck · on Jan 6, 2016

This is really interesting, and it reminds me of the Cangjie input method for Chinese characters: https://en.wikipedia.org/wiki/Cangjie_input_method

It was designed to run within the disk+memory limitations of an Apple II where they couldn't store a full character dictionary. Instead, it broke the characters up into a few fundamental shapes and composition rules, and generated the graphics on-the-fly. The encoding of a character was the set of keystrokes that signified how to compose it, which meant you could "invent" new characters by just typing random strings.

hardmaru · on Jan 6, 2016

It's very interesting that you bring this up. Cangjie (倉頡) was the first Chinese input method I learned, and I didn't know it was designed to run within disk+memory limitations. To this day, I think it's still the best input method, and it's the preferred choice for many professionals in the publishing business in Taiwan.

For example, if I wanted to type the character 森, which means forest, I would just type three trees: 木木木

The logical ordering flows very well too. For example:

door: 門＝日弓

question: 問＝門+口＝日弓口
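To make the composition idea concrete, here is a minimal Python sketch of the examples above. The table holds only the four keys needed here, and `radicals` and `compose` are names made up for illustration. Plain concatenation is a simplification: real Cangjie caps code length and has abbreviation rules for longer characters, which this ignores.

```python
# Tiny subset of the Cangjie key-to-radical table (just the keys used above).
KEY_TO_RADICAL = {"A": "日", "N": "弓", "R": "口", "D": "木"}

# Codes for the characters mentioned in this comment.
CODES = {"門": "AN", "問": "ANR", "森": "DDD"}


def radicals(code):
    """Spell out a key sequence as the radicals it stands for."""
    return "".join(KEY_TO_RADICAL[key] for key in code)


def compose(*parts):
    """Naive composition: concatenate the codes of the parts.

    Assumption: ignores Cangjie's length limit and abbreviation rules.
    """
    return "".join(parts)


print(radicals(CODES["森"]))   # the three-tree code for forest
print(radicals(CODES["問"]))   # door followed by mouth
print(compose(CODES["門"], "R") == CODES["問"])
```

It only works this neatly because 門 and 口 are short codes; longer components get truncated in the real scheme.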

What I find interesting about the LSTM+MDN neural network is that it has no built-in notion of radicals. It has to come up with that concept on its own, and then with the even more abstract concept of combining radicals to form a Kanji.
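For anyone unfamiliar with the MDN part: at each step the network outputs the parameters of a mixture of Gaussians, and the next pen offset is sampled from that mixture. Below is a rough Python sketch of just the sampling step, not the actual code from the demo. The function name `sample_offset` and the parameter layout are made up for illustration, and it assumes independent x and y components, which may be simpler than what the real model uses.

```python
import random


def sample_offset(weights, means, sigmas, rng=random):
    """Sample one (dx, dy) pen offset from a Gaussian mixture.

    weights: mixture weights, one per component
    means:   list of (mean_dx, mean_dy) pairs
    sigmas:  list of (sigma_dx, sigma_dy) pairs
    Assumption: dx and dy are independent within each component.
    """
    # Pick a mixture component in proportion to its weight.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    mean_dx, mean_dy = means[k]
    sigma_dx, sigma_dy = sigmas[k]
    # Draw the offset from the chosen component.
    return rng.gauss(mean_dx, sigma_dx), rng.gauss(mean_dy, sigma_dy)


# Two hypothetical components: a short move right and a longer move down.
dx, dy = sample_offset(
    weights=[0.7, 0.3],
    means=[(1.0, 0.0), (0.0, 5.0)],
    sigmas=[(0.1, 0.1), (0.5, 0.5)],
)
print(dx, dy)
```

In the real network the weights, means, and sigmas would come out of the LSTM at every step, so the mixture changes as the stroke is drawn.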

I wonder if we could use some sort of sparse-encoder technique with neural nets to make a more efficient version of Cangjie (which was designed using human heuristics), like Dvorak vs. QWERTY.

hardmaru · on Jan 6, 2016

A blog post on how this algorithm works:

http://blog.otoro.net/2015/12/28/recurrent-net-dreams-up-fak...