The Forgotten History of Chinese Keyboards

chai2010 · 2024-06-03T03:13:00.000000Z

Very good article, like it.

Chinese characters are a kind of pictograph that shares some characteristics with QR codes. In fact, there is a character lookup method called the four-corner method, which maps the shape of a Chinese character to four digits through a few simple rules and is especially suitable for one-way encoding and retrieval. For example, the four-corner code of "龍" is 0121, and the code of "兲" is 1080 (please refer to https://github.com/chai2010/im4corner).
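To make the "one-way encoding" idea concrete, here is a minimal Python sketch. It is only a toy: the table holds just the two codes quoted above, the function names are made up for illustration, and it does not reproduce how im4corner itself works. A usable input method would need a complete table.

```python
from typing import List, Optional

# Toy table: only the two codes mentioned in this comment.
# A usable input method would need a complete character table.
FOUR_CORNER = {
    "龍": "0121",
    "兲": "1080",
}

def four_corner(ch: str) -> Optional[str]:
    """Forward lookup: character -> four-digit code (None if not in the table)."""
    return FOUR_CORNER.get(ch)

def chars_with_code(code: str) -> List[str]:
    """Reverse lookup: every character in the table that shares the code."""
    return [ch for ch, c in FOUR_CORNER.items() if c == code]

print(four_corner("龍"))
print(chars_with_code("1080"))
```

The reverse lookup returns a list because a four-digit code can be shared by several characters, so the user still picks from a set of candidates.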

In addition, Chinese characters actually matter even more as pictographic shapes. For example, we have "凹语言" (Wa-lang, https://github.com/wa-lang/wa/ ), a language designed for WebAssembly (WASM for short; WebAssembly => WASM => Wa). The Chinese character "凹" looks very similar to the WASM logo, and it even had the pronunciation "wa" in the past.

Since computers became widespread, input methods have improved greatly, but there is still a lot of friction when typing. For example, in programming, frequently switching between Chinese identifiers and English keywords hurts input efficiency. As a programmer, I hope Chinese users will continue to pay attention to these problems and improve them in the future.

mjklin · 2024-06-01T20:14:22.000000Z

No mention of the competing MingKwai typewriter of Lin Yutang, the famous popularizer of Chinese culture to the west. Apparently his prototype suffered an embarrassing failure at an investor meeting and couldn’t get off the ground. But the idea was good. Article here: https://thereader.mitpress.mit.edu/the-uncanny-keyboard/

omoikane · 2024-06-02T18:48:00.000000Z

Lin Yutang had a few patents related to this typewriter, I believe this is the main one:

https://patents.google.com/patent/US2613795A/

The idea of searching for characters via parts is similar to how the Cangjie input method selects characters from radicals. I read somewhere that the Cangjie input method was indeed inspired by the Ming Kwai typewriter, but I can't find the citation for it.

upon_drumhead · 2024-06-01T20:11:53.000000Z

Radiolab also did an excellent podcast on the topic

https://radiolab.org/podcast/wubi-effect

grendelt · 2024-06-01T21:47:03.000000Z

That one was so good. I was completely ignorant of the topic before that episode aired.

asdasdsddd · 2024-06-01T21:06:47.000000Z

> like Chairman Mao Zedong, who seemed to equate Chinese modernization with the Romanization of Chinese script

One of Mao's better ideas

DiogenesKynikos · 2024-06-01T22:11:34.000000Z

Romanization of Chinese writing was already proposed during the New Culture movement in the 1910s-20s. China's most famous modern writer supported it.

However, the Chinese language has evolved alongside the characters for about 3000 years, and it's very difficult to just separate the two. A huge amount of culture is bound up with the characters. Not only that, but the Romanized writing system is viewed as something that only little children use (as an aid to learn the characters). Once you've put in the effort to learn the characters (as about a billion people have), it's very difficult to accept their replacement by what is viewed as a script for children.

mchaver · 2024-06-01T22:26:59.000000Z

The nice thing about Chinese is information density of writing. Something nice about seeing how much information can be squeezed into a small space. Feels like you front load more on the learning side, but get rewarded when reading and scanning texts. Not sure how much scientific evidence is behind that, just an anecdotal observation. Relatively few Chinese speakers want to give up characters.

xanderlewis · 2024-06-01T23:33:31.000000Z

I’m not sure how much evidence there is for that either — a Chinese friend couldn’t believe that I could just look at a paragraph of English and instantly know roughly what it was about; she, despite her fluency in written English, thought only Chinese characters would allow for such rapid comprehension.

It’s certainly denser, though. And I agree about the front-loading of learning. It’s like learning vi. An absolute pain at first, then very comfortable.

rjh29 · 2024-06-02T04:20:39.000000Z

I don't think (for me) chinese or english reading is particularly different. In both cases you're scanning whole blocks (words, phrases) at a time. Sometimes I feel like I read Chinese slower purely because of how dense it is.

xanderlewis · 2024-06-02T14:10:01.000000Z

Yeah. That was pretty much my point — no native speaker is even looking at each letter (or even each word), wnlch js wzy yxu cyn upigemqand thws siktsmce wjtdut mnrh of a ptublim. Each word is its own shape, much like how Chinese speakers aren’t looking at each stroke.

mchaver · 2024-06-02T15:05:45.000000Z

Much like in English you can pull out lots of vowels and still read the text, you can cover up about the bottom 40% of the characters in a sentence and still read it.

Terr_ · 2024-06-01T23:14:11.000000Z

> information density of writing

I feel like a proper comparison would not be number of characters, but a kind of pixel-budget, assuming both meet a certain reading speed and accuracy rate.

user982 · 2024-06-01T23:35:39.000000Z

I was reading a Wikipedia page (https://en.wikipedia.org/wiki/Twelve_Metal_Colossi) and was struck by the difference in length of the Chinese quotes and translations. E.g.:

  收天下兵, 聚之咸陽, 銷以為鍾鐻金人十二, 重各千石, 置廷宮中. 一法度衡石丈尺. 車同軌. 書同文字.

was translated into

  He collected the weapons of All-Under-Heaven in Xianyang, and cast them into twelve bronze figures of the type of bell stands, each 1000 dan [about 30 tons] in weight, and displayed them in the palace. He unified the law, weights and measurements, standardized the axle width of carriages, and standardized the writing system.
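For a rough sense of the gap, here is a small Python sketch that counts code points and UTF-8 bytes for the two passages. The helper name `sizes` is made up for illustration, and neither count measures reading effort or the pixel budget mentioned upthread.

```python
chinese = "收天下兵, 聚之咸陽, 銷以為鍾鐻金人十二, 重各千石, 置廷宮中. 一法度衡石丈尺. 車同軌. 書同文字."
english = (
    "He collected the weapons of All-Under-Heaven in Xianyang, and cast them "
    "into twelve bronze figures of the type of bell stands, each 1000 dan "
    "[about 30 tons] in weight, and displayed them in the palace. He unified "
    "the law, weights and measurements, standardized the axle width of "
    "carriages, and standardized the writing system."
)

def sizes(text: str) -> dict:
    """Count Unicode code points and UTF-8 bytes for a piece of text."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
    }

for label, text in (("chinese", chinese), ("english", english)):
    print(label, sizes(text))
```

Each Chinese character here takes three bytes in UTF-8 while each English letter takes one, so the byte counts are closer together than the character counts.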

yongjik · 2024-06-02T02:27:44.000000Z

I don't speak Chinese, but my understanding is that it's not a totally fair comparison: classical Chinese text was often highly abbreviated, to such a degree that you have to be an expert historian to interpret it correctly.

For example, the characters comprising your example text start like:

collect (收) [from] [all] soldiers (兵) under the sky (天下), gather (聚) at(?) (之) Xianyang (咸陽), melt (銷) and (以) become(?) (為) bell-stand (鍾鐻) metal (金) person (人) twelve (十二) ...

As you can see, the English "translation" is more like an annotated translation. E.g., the original doesn't say who did it, or what he collected from soldiers: we just inferred "weapon" because what else could be melted into statues?

Similarly, "standardized the axle width of carriages" is just: cart (車) same (同) axle width (軌). We're supposed to infer "standardized" because we are talking about the Emperor's deeds.

inkyoto · 2024-06-03T01:01:28.000000Z

Classical Chinese (Ancient or Old Chinese; multiple terms are used), the language the quote was written in, predates Middle Chinese and, by extension, all modern Chinese languages. It had a very different grammar, many features of which have all but disappeared from all Chinese languages. Classical Chinese texts are incomprehensible to a modern Chinese person who has not first invested sufficient time and effort in Classical Chinese studies.

There is a book, «Classical Chinese for everyone: a guide for absolute beginners» by Bryan W. van Norden that is easy to read and gives a gentle introduction into Classical Chinese.

The old grammar and vocabulary, coupled with the Chinese style of writing metaphorically with abundant allusions and with the same Chinese characters having multiple unrelated meanings, make Ancient Chinese texts very terse and notoriously difficult to understand, even for educated Chinese people.

cwilby · 2024-06-02T00:27:57.000000Z

I just started learning Chinese about 2 months ago, to me it seems they stuff whole concepts into characters.

For example,

"去" (pronounced "Qú") is "going to the". "超市" (pronounced "Chao Shi") is "supermarket". "去超市" (pronounced "Qú Chao Shi") is "going to the supermarket".

3 syllables vs 7 syllables.

To me, it seems that instead of composing letters into words to convey meaning, they have more letters that are mini-words unto themselves.

mook · 2024-06-02T04:37:59.000000Z

Don't forget all the abbreviation. "超市", supermarket, is abbreviated from "超級", super, and "市場", market. The equivalent in English would be "sup-mark" or something along those lines. (Or in Japanese, just "super".)

Terr_ · 2024-06-02T08:05:49.000000Z

Since we're now talking about verbal rather than written:

> No matter how fast or slow, how simple or complex, each language gravitated toward an average rate of 39.15 bits per second, they report today in Science Advances.

-- https://www.science.org/content/article/human-speech-may-hav...

cwilby · 2024-06-03T16:41:36.000000Z

This tracks - it's difficult to speak at the same pace in Chinese as I can English. That said - are those 39.15 bits plaintext? Compressed? Encrypted?

The size of a word does not correlate with its concept - I still posit that some languages can transfer concepts faster than others, minus our baud rate.

Edit: Or, perhaps I am not as gifted an English speaker as my bias has presumed :| For example, I had to look up "syntagmatic".

shxdsg · 2024-06-10T05:33:18.000000Z

Actually “去” is pronounced “Qù”

cwilby · 2024-06-11T17:34:19.000000Z

Thank you

kalleboo · 2024-06-02T04:26:23.000000Z

> However, the Chinese language has evolved alongside the characters for about 3000 years, and it's very difficult to just separate the two. A huge amount of culture is bound up with the characters.

How did that work out for Korea when they switched to Hangul?

alexlur · 2024-06-02T04:39:59.000000Z

They are not comparable. The Chinese script was tailor-made for Chinese languages, while it was simply adopted by the Koreans, which arguably was a bad fit because Korean is 1) agglutinative and 2) not even a Sino-Tibetan language. Even then hanja is still part of the national education curriculum today (look up 한문 교육용 기초 한자).

numpad0 · 2024-06-02T07:22:13.000000Z

Prewar Korean writing used Japanese-style Kanji for nouns intermeshed with Hangul phonetics. Postwar, under US influence, they transitioned to an all-Hangul phonetic script, but IMO it looks like a big regression in their communication ability due to the resulting arrays of pure homonyms.

They rely purely on context to distinguish {"apples", "apologies"}, {"mayor", "market"}, {"stomach", "ship", "pear", "double"}, {"acting", "delays", "smoke"} so on and so forth if what I'm scrolling is right. There's no tonal or character distinction. That surely isn't great.

int_19h · 2024-06-02T20:10:05.000000Z

Pure Hangul was used for a long time before then, just not in any kind of official capacity after Sejong. But e.g. most "women's literature" would be written in it.

And back when it was first introduced, it certainly did wonders for literacy. Although it should be noted that original Hangul was more phonemic wrt its contemporary Korean, and the letter shapes were a bit simpler as well.

yongjik · 2024-06-02T18:30:53.000000Z

I don't know where you got the idea of "under US influence", but mixed Korean/Chinese character writing was common in South Korea well into 1980s, long after Korea became its own country. For example, in 1987, the newly founded Hankyoreh newspaper made a splash by deliberately writing all articles in pure Korean script, which was not the norm until then.

Gradually more books and newspapers followed suit, because pretty much everybody found that writing everything in Korean letters actually makes communication less ambiguous and easier to understand. If your phrase is ambiguous between whether someone's offering apples or apologies, then you just change the word or add additional context to make it clear which one is being offered. It's no different from how English speakers deal with bear/bear, tear/tear, arm/arm, ground/ground, and so on.

sho_hn · 2024-06-02T01:26:25.000000Z

Here is a wonderful article by John DeFrancis on the topic:

The Prospects for Chinese Writing Reform (2006)

https://sino-platonic.org/complete/spp171_chinese_writing_re...

It is cited frequently.

asdasdsddd · 2024-06-01T22:15:13.000000Z

Almost all digital communication is written using pinyin, which today is almost all written communication

alexlur · 2024-06-01T22:22:12.000000Z

This is an extremely mainland-centric view. Cangjie is the dominant IME in Hong Kong.

asdasdsddd · 2024-06-01T22:24:46.000000Z

That's why I said almost all

alexlur · 2024-06-01T22:34:57.000000Z

It’s only almost all if you only interact with the millennials or younger. Pinyin is an IME for Mandarin. If you aren’t fluent in Mandarin, chances are you use voice input or stroke typing.

causality0 · 2024-06-02T01:08:19.000000Z

Why shouldn't it be mainland-centric? Mainland China is 99.5 percent of the population of China. That's like refuting a claim about Americans by calling it "a very non-Pennsylvanian view".

alexlur · 2024-06-02T01:34:56.000000Z

Because China is not the only place where Chinese languages are spoken. There’s more than 10 million ethnic Chinese in Southeast Asia alone. And it’s not only a mainland-centric view: it’s a mainland–Mandarin speaking centric view.

DiogenesKynikos · 2024-06-02T07:06:24.000000Z

Pinyin is used as input to select characters, but the final text that's used to communicate is composed of characters.
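A toy Python sketch of that flow, in case it helps: the typed pinyin is only a lookup key, and what gets committed to the text is the chosen character. The candidate table and function names are made up for illustration, and the table uses only the words mentioned elsewhere in this thread. Real IMEs use large dictionaries and rank candidates by context.

```python
from typing import List

# Hypothetical candidate table keyed by toneless pinyin.
CANDIDATES = {
    "qu": ["去"],
    "chaoshi": ["超市"],
}

def candidates(pinyin: str) -> List[str]:
    """Return the candidate characters or words for a typed pinyin string."""
    return CANDIDATES.get(pinyin.lower(), [])

def commit(pinyin: str, choice: int = 0) -> str:
    """Return the selected candidate, or the raw pinyin if nothing matches."""
    options = candidates(pinyin)
    if choice < len(options):
        return options[choice]
    return pinyin

# The user types Latin letters, but the resulting text is in characters.
print(commit("qu") + commit("chaoshi"))
```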

alexlur · 2024-06-01T21:44:16.000000Z

Thank God it didn’t happen.

z2 · 2024-06-02T03:42:58.000000Z

Much of the simplification adopted shorthand already in common use, which is why Japanese shinjitai simplification independently arrived at many similar characters and patterns. The second simplification round was an abysmal newspeak-esque failure, and thank goodness _that_ wasn't adopted either.

asdasdsddd · 2024-06-01T22:11:47.000000Z

pinyin is the best thing that happened to the language after simplification.

Not only did it propel literacy rates to basically 100%, but it added a phonetic component to the language

alexlur · 2024-06-01T23:06:35.000000Z

Again, this is a very mainland-centric view. Hong Kong has never simplified their writing system or even developed a proper romanization, and yet has consistently one of the highest literacy rates in the world. Guess what helped literacy? Post-war socioeconomic development like poverty reduction, mass education and industrialization.

> it added a phonetic component to the language

Fanqie has been a thing since the 2nd century. Zhuyin was invented in 1913.

charlieyu1 · 2024-06-01T23:47:54.000000Z

Agreed. I have seen kids from mainland China spending lots of time learning pinyin while kids from Hong Kong at the same age can already write some characters and pronounce the words accurately

numpad0 · 2024-06-02T03:08:43.000000Z

Simplification is just bad. It removes so much that it breaks the ability of non-speakers to infer meanings. Complexity of letter shapes is irrelevant to ease of use on computers, so it's just a massive loss.

fjdjshsh · 2024-06-02T20:59:04.000000Z

>it breaks ability for non-speakers to infer meanings

Not sure what you mean by this. Do you mean that it's less convenient for people that don't speak / read Chinese? Why would that be a relevant metric?

You may be missing that character standards have changed over time and that different writing styles (草书，行书) are implicitly simplifications. You can think of Latin or Russian cursive as a simplification of the printed letters.

In practice, the phonetic component has been mangled / evolved over time, so simplification doesn't make things more or less difficult for students (be it 5 year old native speakers or 50 year old non native speakers).

rurban · 2024-06-02T13:12:40.000000Z

Worked out excellent for Korean (Hangul) though. Also English.

Both massive wins

numpad0 · 2024-06-02T15:50:16.000000Z

I don't think it did for Korean, though I need input from speakers to be sure. From my experience, Korean MT routinely stops halfway through inputs and dumps nonsensical phonetic transcripts, likely from failing to identify words. I suspect they were just being complaisant to American influence in postwar years. Computers failing to even isolate and match words in this day and age is not a sign of an excellent working script.

yorwba · 2024-06-02T18:15:50.000000Z

Translation needs phonetic transcription to handle proper names. If there are words that may or may not be proper names depending on context, machine translation will guess the context wrong at least some of the time and phonetically transcribe what should be translated, or translate names that should be transcribed.

The problem can also happen when translating from English, if you think about all the surnames that are occupations, or names like "bill" or "lily." Capitalization usually helps disambiguate, but there's title case and all caps and people who never capitalize anything...

numpad0 · 2024-06-03T08:36:48.000000Z

It's not just proper nouns. Korean MT seems to routinely "de-synchronize" into wonbonhangugeotegseuteububun mid-sentence; sometimes it comes back in sync, sometimes it stays out of sync until the end of the sentence. It happens way more often than average with the Korean language.

yorwba · 2024-06-03T15:01:17.000000Z

Do you have an example input where that happens?

DiogenesKynikos · 2024-06-02T19:42:05.000000Z

Simplified Chinese characters are already difficult enough for foreigners to learn. Making them learn traditional characters would just be sadistic.

numpad0 · 2024-06-03T14:33:28.000000Z

Traditional characters are built on common parts for pronunciation and meaning cues. Simplified removed that, so IMO it compresses worse and is therefore harder. It's visually less dense, but so what.

DiogenesKynikos · 2024-06-03T14:54:12.000000Z

Those cues are there in exactly the same way in most simplified characters.

The cases where simplification has removed those cues are rare enough that the extra complexity of traditional characters is really not worth it.

I've never heard anyone claim that simplified characters are more difficult to learn, and it just seems false to me.

ogurechny · 2024-06-02T03:57:34.000000Z

“Literacy rate” is just a bureaucratic index. It was increased in most countries with mostly the same measures, no matter what their writing system was. If you look closely, “literacy” meant “making the mass of workers and soldiers capable of following basic instructions”, and there often was not much for them to read except for parroted propaganda (obviously, I'm not talking about China specifically, as it has been the same everywhere).

tengwar2 · 2024-06-03T22:42:57.000000Z

Phonetics can be counterproductive to comprehension, or converting meaning to text. Take an example much closer to English: Scottish Gaelic, which is written with the Latin alphabet. It's considerably older than English, has more distinct consonants and vowels, and it is really difficult to guess the pronunciation from a written word if you only speak English (unlike Welsh, which has nice orthography and is easier than English in that respect).

Because of these difficulties, there is a long tradition of anglicising names of settlements to meaningless collections of letters which when read by an English speaker approximate vaguely to the original Gaelic name. Unfortunately this is not a reversible process - you can't look at a modern anglicised name and guess what the Gaelic is, in general.

Now while Gaelic has a tiny population of native speakers, there are millions of people who know some "map Gaelic" - that is, we can look at a map with Gaelic place names, and understand the elements. It doesn't work for towns and villages, but generally in the north, no-one bothered to anglicise the names of natural features, just the settlements - and walking is the most popular outdoor recreation in the UK, so we learn this when we read maps.

When the first SNP government of Scotland came in, they introduced bilingual road signs, even in areas where Gaelic is no longer spoken. There was and is complaint over this, but I found that things became much clearer. I could look at a placename like Machrihanish, and see that it is Machaire Shanais. I still don't know what Shanais means, but Machaire is a type of landscape that I know, so I immediately know that this is low-lying and grassy, and fairly level. I can do this for thousands of place names without being able to reliably tell how to pronounce the words - similar to the way that the pronunciation of a word indicated by a Chinese character can vary widely with the part of China, so that the pronunciation becomes quite secondary to communication.

vunderba · 2024-06-02T01:18:18.000000Z

Uh... no. Bopomofo which is used in Taiwan is a phonetic script that is used as a popular IME.

And simplification's only "arguable merit" is that it saves a fortune in ink at the expense of losing its historical roots. But guess what? We mostly use computers now. So great job Mao, now we have two competing standards. (Nod to XKCD).

Unrelated but to those of us who started with 繁體字, simplified just looks ugly. (龙 vs 龍)

iforgotpassword · 2024-06-02T19:10:02.000000Z

Sure traditional looks nicer, but holy fuck is writing it (by hand) ever annoying. When I asked friends who grew up with the traditional characters about it they said a lot of people use some form of simplification when taking notes or leaving messages for friends/family. People from mainland seem to only shorten words by omitting characters of longer words, if at all.

And about losing the historical roots, I guess if you're interested in it, the characters will always be there and accessible for you to study. I'd be interested in how much the average Joe from Taiwan really remembers about random characters' roots, composition and meaning. I know many more people from the mainland, and among them are people who don't give a shit, and those who can also write a lot of traditional characters and give lectures about the origin of meaning of some character and whatnot.

Also, since this is about computers after all, I saw a study a while ago from the mainland that tested how many mistakes people make when writing less common characters. There was a bar chart that went down between 10 and 20ish, then went up a bit and started to go down again at around 30. It was speculated that people in school still have to write a lot by hand, and during/after college that stops and everything has been digital for a decade now so people just forget again, but folks old enough to have used pen and paper for a couple of decades just had enough practice. I wonder if this effect would be more or less pronounced with traditional characters.

rjh29 · 2024-06-02T04:15:17.000000Z

I feel like Japanese strikes the right balance, no ugly oversimplified characters but making common kanji easier to write for children (國→国、櫻→桜）

For example 竜 is a fairly common simplification of 龍 and imo not nearly as ugly

z2 · 2024-06-03T04:23:56.000000Z

There are some strange-looking ones too (圖-図、圓-円), but agree that overall it was lighter touch. I think all simplification projects have an inherent awkwardness in taking handwriting shorthand or cursive and trying to reformalize it back to print. In any case it's a shame that there was no coordination due to obvious geopolitical conflicts, that we're now left with 3 sets. It was easier last time, 2.2k years ago when some dude took over all places that wrote Chinese characters and forced a single way of writing :)

Vt71fcAqt7 · 2024-06-02T21:41:16.000000Z

Yeah except hiragana and especially katakana both look ugly though.

otabdeveloper4 · 2024-06-03T10:59:57.000000Z

> simplified just looks ugly

I prefer simplified for the aesthetics alone. Traditional is cringe and ugly in typed form.

wolfgangbabad · 2024-06-01T21:46:01.000000Z

Vietnamese is relatively OK.

alexlur · 2024-06-01T21:56:22.000000Z

Chữ Nôm is a borrowed writing system and not native to Vietnamese, which isn’t even a Sino-Tibetan language to begin with.

gumby · 2024-06-02T02:37:05.000000Z

Latin is a borrowed writing system not native to English, German, Polish and many others which aren’t even Romance languages to begin with and must resort to di- and trigraphs plus non-Latin characters like J, V, ß, ł or å, among others (not to mention diacritics).

DiogenesKynikos · 2024-06-02T20:00:01.000000Z

Alphabets are much more flexible than the Chinese characters.

An alphabet can be adapted to basically any language. You just have to map the letters to the sounds, and you're pretty much done.

By contrast, the Chinese writing system is adapted very specifically to the properties of Chinese language. Every syllable in Chinese has a meaning (or set of meanings), so every character represents one meaning (or a few). English does not have that structure: words can have very arbitrary syllables that don't have any meaning on their own. Chinese characters encode a meaning plus a sound, which is often reflected in how they're composed (i.e., a character will often be composed of two simpler characters, one of which has the correct meaning and one of which has the correct sound). Chinese words do not change form: there's no conjugation, no plural form, etc. As a consequence, the writing system has no way to deal with things like conjugation.

I have no idea how one would even begin trying to adapt Chinese characters to write English. On the other hand, it's relatively easy to come up with a way to write Chinese in any alphabet.

int_19h · 2024-06-02T20:02:02.000000Z

"Å" is just "O" stacked on top of "A" though. And "V" is in fact the OG Latin form ("U" is the newly introduced one).

But yeah, the whole notion is kinda silly. Most writing systems in the world are developed from very few originals. E.g. for most of Eurasia, the source is either Egyptian hieroglyphs or the Shang Oracle bone script.

acwan93 · 2024-06-01T21:56:19.000000Z

Relatively. The amount of diacritics on Vietnamese surpasses European languages so text rendering becomes a challenge if a naive developer doesn't test with Vietnamese.
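One concrete thing worth testing is Unicode normalization, since a stacked Vietnamese letter can arrive either precomposed or as a base letter plus combining marks. A minimal Python sketch using the standard `unicodedata` module follows. The helper name `describe` is made up for illustration, and this covers only the string side of things, not font shaping.

```python
import unicodedata

def describe(text: str) -> dict:
    """Compare the composed (NFC) and decomposed (NFD) forms of a string."""
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    return {
        "nfc_len": len(nfc),
        "nfd_len": len(nfd),
        # Characters with a non-zero combining class are combining marks.
        "combining_marks": sum(1 for ch in nfd if unicodedata.combining(ch)),
        "same_after_nfc": unicodedata.normalize("NFC", nfd) == nfc,
    }

# "Việt" written with an escape so the input form is unambiguous:
# U+1EC7 is e with circumflex and dot below.
print(describe("Vi\u1ec7t"))
```

Code that compares, truncates, or measures strings without normalizing first can treat the two forms as different text.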

numpad0 · 2024-06-02T00:59:56.000000Z

Is bringing back Chu Nom script going to simplify Vietnamese support on computers by a lot? It's unintelligible to CJK users, but as far as text rendering goes, it seems just simple Kanji/Hanzi.

publicola1990 · 2024-06-02T18:53:25.000000Z

The Vietnamese romanized their writing, and they seem to be doing fine.

alexlur · 2024-06-03T02:27:54.000000Z

This isn’t factually correct. The French colonial administration romanized their writing and enforced chữ Quốc ngữ.

geraldwhen · 2024-06-02T01:30:55.000000Z

I find it humorous that 鷹 was described as a difficult character in the article. It’s like 3 radicals and the character for bird.

Perhaps it’s difficult to render in a tiny Latin-alphabet font, but if you have any Japanese or Chinese study under your belt, you could read and reproduce that nearly instantly on sight.

surrTurr · 2024-06-02T17:42:14.000000Z

i think he meant difficult in the sense that it consists of many strokes, not in the sense how difficult it is to remember. however, one could argue that there are many other, more complex to write, kanji than 鷹

mrweasel · 2024-06-02T17:43:32.000000Z

It's interesting to consider that both Japan and China might have been prevented from ever being first with general-purpose computers. ASCII, and other encoding schemes, only needed to make provisions for fewer than 200 characters, making them possible to implement with the limited storage and memory available to early computers. The sheer number of characters in some languages, like Chinese, may have served as a distraction or roadblock for early computers in those countries.

_xnmw · 2024-06-02T18:08:26.000000Z

Makes you wonder what limitations of our own language and culture are preventing us from inventing certain things?

raincole · 2024-06-02T20:26:48.000000Z

In an alternative history timeline it might be true.

In our timeline I highly doubt whether it was the main reason why general purpose computers didn't happen first in China or Japan.

dang · 2024-06-01T19:41:48.000000Z

See also "How the quest to type Chinese on a QWERTY keyboard created autocomplete": https://www.technologyreview.com/2024/05/27/1092876/type-chi...

(via https://news.ycombinator.com/item?id=40548356, but no comments there)

acheong08 · 2024-06-01T21:52:39.000000Z

“Safari can’t open the page because the address is invalid”

How strange.

More on topic: Considering how inefficient Chinese characters are in general (but especially evident in computing) as one of the few languages where characters have no direct relation to phonetics, I wonder why there hasn’t been an effort to modernize it similar to Hiragana in Japan. Well, considering how Chinese is basically Kanji, why not just adopt Japanese?

canjobear · 2024-06-01T22:43:01.000000Z

There were various attempts to develop an organic phonetic writing system for Chinese, like hiragana for Japanese, for example Bopomofo (still used in Taiwan) and General Chinese (https://en.wikipedia.org/wiki/General_Chinese). The Simplified characters that you see on the mainland today were originally part of a multi-phase scheme to eventually replace characters altogether, but the second phase (https://en.wikipedia.org/wiki/Second_round_of_simplified_Chi...) was bungled so badly that it didn't continue. In practice Pinyin is the standard phonetic writing now and is used when people can't remember a character.

alexlur · 2024-06-01T22:03:27.000000Z

> how inefficient Chinese characters are in general (but especially evident in computing)

We are not in the 90s anymore. UTF-8 has been around for 32 years now. If you’re working for a system that has no UTF-8 support, you have a much bigger problem to worry about.
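For anyone who hasn't looked at it, a quick Python sketch of what that means in practice. The helper name `utf8_hex` is made up for illustration; the encoding calls are standard Python.

```python
def utf8_hex(ch: str) -> str:
    """Show the UTF-8 bytes of a string as space-separated hex."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))

print(utf8_hex("A"))   # one byte
print(utf8_hex("龍"))  # three bytes

# Round trip: decoding the bytes gives back the same character.
assert "龍".encode("utf-8").decode("utf-8") == "龍"
```

The same byte stream carries ASCII and Chinese characters side by side, so no special code page is needed.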

> characters have no direct relation to phonetics

Most characters are phono-semantic where one part of the character is a phonetic hint and the other is a semantic hint.

> modernize it similar to Hiragana

Hiragana isn’t and wasn’t intended to replace kanji (unless you are from the fringe Kanamozikai). It serves a different grammatical purpose and is complementary to the other two. Kana is useful for an agglutinating language like Japanese, but not Chinese languages.

numpad0 · 2024-06-02T01:19:29.000000Z

I think one of the statements about CJK languages that has to be made more often is that each of the languages has its own numerous dialects with dubious mutual intelligibility, e.g. the Tsugaru and Kagoshima dialects against standard Japanese.

The phrase "a language is a dialect with an army" often comes up in discussions of Asian languages, and it causes friction between CJK non-speakers wondering about compatibility between the three and speakers showing near-vile dissent to those questions. While I understand both of these sentiments, the situation is not ideal for either side.

IMO, it might be weird to refer to these languages as "Beijing Tokyo Seoul" languages, but doing so occasionally (just occasionally) could create a more tangible feel for why these three seem to exist side by side so utterly disconnected from each other.

shiomiru · 2024-06-01T22:16:53.000000Z

> Kana is useful for an agglutinating language like Japanese, but not Chinese languages.

FWIW, the Japanese did develop a kana-based system for Taiwanese during the occupation, but it was an abomination.[1]

[1]: https://en.wikipedia.org/wiki/Taiwanese_kana

faitswulff · 2024-06-02T00:51:05.000000Z

There are a lot of underlying assumptions here:

1. That Chinese writing is inherently inefficient. It's actually very efficient...to read. And nothing beats the efficiency of having a script that maps perfectly to the language. Also as sibling comment notes, UTF-8 is a thing.

2. That there is no relation between written characters and phonetics. Incorrect, as several sibling comments point out.

3. That Japanese kana represents a successful "modernization" of kanji that Chinese should emulate.

4. That Chinese is "basically kanji" - assuming the Chinese and Japanese languages are essentially interchangeable. They...are not. I can't even begin to emphasize how much they are not. Chinese is subject-verb-object while Japanese is subject-object-verb, for instance. Chinese also has many phonemes that are incompatible with Japanese, which would not be covered in hiragana. Finally, kanji came from Chinese, and while it is mostly a subset of Chinese hanzi, it has subtle differences and its own slightly different character set.

numpad0 · 2024-06-02T08:42:58.000000Z

GP is making an understandable mistake, due to how the three Far East countries are presented to the world at large: as three countries in Asia that practically touch each other, just like Germany, Belgium, and the Netherlands in Europe.

Tokyo is about as far from Beijing (2000km/1200mi) as Paris is from Kyiv. Far East countries are also separated by seas, like Mediterranean countries on opposite shores. I doubt many Parisians have any meaningful idea of Ukrainian as "basically Latin" in any way or form, or Italians of Tunisian, but the presentation mentioned above creates a false instinct that those Asians are next-door neighbors.

That, and mistaking the personal difficulties and inefficiencies of understanding a language as a non-native speaker for inferiority of the foreign language.

alexlur · 2024-06-02T01:25:02.000000Z

It’s really bizarre to see someone claim kana has anything to do with “modernization”. The Japanese modernization and industrialization period is famously associated with translating Western concepts and terminologies into Sinitic words that later spread to China, Korea and Vietnam.

rjh29 · 2024-06-02T04:23:48.000000Z

That was true like 100 years ago, but nowadays katakana words are extremely popular and increasingly used over their Sinitic counterparts, so I feel it's a valid argument.

Also, it's not uncommon for words like ろ過（濾過）to be written only partly in kanji, especially in the news... if that trend continues beyond the 常用 kanji we might end up with a Japanese that is closer to Korean.

alexlur · 2024-06-02T04:57:12.000000Z

The modernization argument only makes sense if your society is economically or militarily inferior to the society you want to emulate. It was the case 100 years ago, but not today.

The Japanese economy has been stagnant for over 30 years with no end in sight. Following the same logic, Japan should perhaps “modernize” their language by following China, which is a ridiculous conclusion as you can tell.

numpad0 · 2024-06-02T00:50:30.000000Z

> why not just adopt Japanese

Because Japanese characters have no direct relation to Chinese phonetics. The two belong to different dialect continua, and their phonetics aren't compatible.

And I suspect the same might explain the lack of a native Chinese phonetic script: `Chinese` isn't a single spoken language; what goes by that name is the Beijing-area version of one of the Chinese (or Sinitic) languages [1]. The written language was universally understood in China due to bureaucratic needs, but AIUI it's not the same as the spoken language and it's not necessarily used everywhere. Maybe they just had little use for a standardized phonetic script?

1: https://en.wikipedia.org/wiki/List_of_varieties_of_Chinese

barronli · 2024-06-02T02:12:41.000000Z

It was still very useful to standardize pronunciation, since people speaking different dialects had to meet, especially government officials. There was “yayan” for this purpose.

https://en.wikimedia.org/wiki/Yayan

highwind · 2024-06-01T22:35:08.000000Z

I'm guessing you are not familiar with how Chinese characters work nor how Japanese Hiragana or Kanji work.

karma_pharmer · 2024-06-01T22:54:29.000000Z

This is not a helpful comment.

acheong08 · 2024-06-02T01:15:40.000000Z

Well obviously not. Posting a dumb question tends to return some very helpful responses

vunderba · 2024-06-02T01:34:31.000000Z

Non-native speakers who suggest that countries arbitrarily modernize or change their language remind me of non-musicians who come around with a new replacement for traditional sheet music. Even if it was a good idea, which in this case it's patently not, it's just not gonna happen.

It's a failure to recognize that languages (of which I would count music as one) evolve organically, and outside of some edge cases, like Esperanto, they're not artificially created in a vacuum.

sho_hn · 2024-06-02T01:36:43.000000Z

See https://sino-platonic.org/complete/spp171_chinese_writing_re...

tdeck · 2024-06-01T23:02:29.000000Z

> one of the few languages where characters have no direct relation to phonetics

nit: It's not accurate to say that the characters have no direct relation to phonetics. Thousands of them are semanto-phonetic compounds, meaning they combine a character relating to the word's (or syllable's) meaning with a character relating to pronunciation. Sinitic languages tend to have a lot of homophones or near-homophones, so this approach works reasonably well as a memory aid once you've memorized a bunch of the basic characters.

One problem is that many of the pronunciations have drifted from the Middle Chinese pronunciation of the words. Also, some of them have been simplified in Simplified Chinese which makes the components a bit harder to discern.

I've been learning some Cantonese recently and this is very apparent with certain common Cantonese words. For example, the first-person pronoun in Cantonese is pronounced ngo, with a low-rising tone, and written like this:

我 https://www.cantonese.sheik.co.uk/dictionary/characters/1/

The word for goose in Cantonese is also "ngo", but with a different tone. Here's the character for that:

鵝 https://www.cantonese.sheik.co.uk/dictionary/characters/1200...

If you enlarge it, you'll see that the left side is the same 我 from before. The right side is 鳥, which means "bird" (https://www.cantonese.sheik.co.uk/dictionary/characters/161/). So if you saw this character and knew the basic characters for the pronouns and the word "bird", and you spoke Cantonese, you'd be able to easily understand what it meant.

Here's another one. The word "ngo" with still a different tone means "hungry". How do we write it?

餓: https://www.cantonese.sheik.co.uk/dictionary/characters/740/

In this one the phonetic component is on the right instead, which is a bit inconsistent. The left side is this:

食: https://www.cantonese.sheik.co.uk/dictionary/characters/116/

What does 食 mean? It's the verb "to eat". So if you saw this 餓 character and knew a couple of other basic characters, you could figure out that it's the word "ngo6" meaning "hungry". Many of the characters still work like this although the sound shift I mentioned above means that some work in some Chinese languages and not others.
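To make the decomposition above concrete, here is a minimal Python sketch. It only encodes the three characters from this comment; the `Compound` class, the lookup tables, and the `guess` helper are made-up names for illustration, not part of any real dictionary or input-method library. Tones are deliberately left out, since the phonetic component only hints at the syllable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """A phono-semantic character split into its two hints (illustrative only)."""
    char: str
    semantic: str  # component hinting at the meaning
    phonetic: str  # component hinting at the sound
    gloss: str     # actual meaning, for comparison


# Toneless Cantonese syllable of the phonetic component used in the comment.
BASE_READINGS = {"我": "ngo"}

# Meanings of the semantic components mentioned in the comment.
BASE_GLOSSES = {"鳥": "bird", "食": "to eat"}

COMPOUNDS = [
    Compound("鵝", semantic="鳥", phonetic="我", gloss="goose"),
    Compound("餓", semantic="食", phonetic="我", gloss="hungry"),
]


def guess(c: Compound) -> str:
    """Describe what a reader could infer from the two components alone."""
    sound = BASE_READINGS.get(c.phonetic, "?")
    meaning = BASE_GLOSSES.get(c.semantic, "?")
    return f"{c.char}: sounds like '{sound}', related to '{meaning}'"


if __name__ == "__main__":
    for c in COMPOUNDS:
        print(guess(c), "->", c.gloss)
```

The point of the sketch is that the components narrow things down without fully determining them: both compounds share the "ngo" hint, and the semantic component does the disambiguation.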

hker · 2024-06-02T02:04:43.000000Z

Native Cantonese speaker here, glad that you are interested in learning Cantonese.

I am working with other volunteers to improve Cantonese teaching, and I wonder what difficulties you have encountered when learning Cantonese, and what materials or communities would be helpful for Cantonese learners.

tonetegeatinst · 2024-06-01T23:01:19.000000Z

Asianometry has a good video on this if my memory serves me right.