In my experience, this article paints a very rosy picture of the Chinese language.
The bottom line is that even for a native speaker, if you encounter a new character, chances are you don't know what it means or how to pronounce it. If it's simple enough, you may guess, but that's it.
Learning Chinese means memorizing characters. A lot of them. It takes over a decade to teach a Chinese kid to even read a simple magazine.
Combine this with the use of tones, and you get a pretty complicated mess. You can't use tones to communicate things like irony, anger or doubt. You have to play on volume. You're losing a lot of expressiveness and subtlety in the process.
This is why building a search engine in Chinese is so hard. You can't apply a Porter stemming algorithm for example; there's no concept of stem or etymology (no apparently logical one at least).
Moreover, combining one character before (or after) another can change the meaning entirely. Adding a third one can again change things. Since you don't separate words with spaces in Chinese, good luck figuring things out programmatically. Things in Chinese are heavily relying on context.
Learning Chinese can be an interesting challenge, but understand that it will mostly rely on memorizing characters with little to no logic or trick to help you do so.
>> Combine this with the use of tones, and you get a pretty complicated mess. You can't use tones to communicate things like irony, anger or doubt. You have to play on volume. You're losing a lot of expressiveness and subtlety in the process.
This seems a bit like linguistic relativity. Native speakers don't have problems with tones. It is a challenge for second language learners of Chinese who don't speak a tonal language. Likewise, stress patterns in English are a challenge for second language learners but native speakers don't struggle with it.
There is a lot of word play in Chinese. Just because a syllable has a rising tone does not mean it is the exact same pitch as another syllable with a rising tone later in the phrase; rather, the syllables follow the same pattern of starting from a lower pitch and going to a higher one. You can also make vowels longer or shorter.
>> Things in Chinese are heavily relying on context.
That's the case for most languages. What does 'hit' mean in English?
He hit me. Give me a hit. That show is a hit. He hit a tree with his car. I hit a new high. The boss ordered a hit on him. Tomorrow we will be hit by a storm. They hit it off well. She hit on him. That hits close to home. Let's hit the road.
> Likewise, stress patterns in English are a challenge for second language learners but native speakers don't struggle with it.
This really is key. "Tones" are hugely important in English. A simple sentence like "I didn't steal your money" has at least five different meanings depending on which word gets stressed.
Imagine trying to learn this stuff from scratch as a non-native speaker! How do you even teach that? Are there actual rules for it, or do you just have to figure it out?
At least with Chinese, tones are just part of the pronunciation. A particular word (syllable, really) has a particular tone, and that's it.
You don't have to put stress on any words in that sentence in English, and it will sound perfectly fine. The vast majority of sentences don't stress particular words.
You can also add implications by the gestures you make in between words. That doesn't make English complicated, that's just an irrelevant thing you can do.
Calling such a thing 'hugely important' is just wrong. I didn't miss the point, I think the point has no connection with actual speech.
> This really is key. "Tones" are hugely important in English. A simple sentence like "I didn't steal your money" has at least five different meanings depending on which word gets stressed.
Is this in Chinese or English?
I speak Mandarin (with a bit of a Northeastern accent, which the following is based on) and English. The idea that Mandarin does not express meaning via the tones of individual words (despite being a tonal language) is not correct. It is simply a matter of emphasizing or changing the tone of a word, which goes along with changing the tone of all of the other words in the sentence.
The example sentence:
'I': forcefulness comes from emphasizing the tone to exaggerate the word (choosing to start a sentence with 'I' already suggests a self-centric perspective). For example, deepening the 3rd tone for 'I' would exaggerate self; raising the 3rd tone for 'I' would suggest self-questioning (often followed by a 可, though that is not necessary in speech or in writing) or self-perspective (no 可, just a heightened tone, as in English, though adding a 可 in a rising tone emphasises this a lot).
'didn't' 没 a third tone (down-up) but the level of the single down-up is indecipherable compared to the difference of down-up depending on context. Rising again suggesting innocence and question asking, to a complete deletion of the rising part for complete denial. 3rd tones are a pain as they traditionally get augmented by tones before and after them, but the differentiation between the three possessive 'de's also got eliminated from examinations a few years back (的 adjective possessive, 得 verb possessive, 地 place possessive - not sure if this is a good thing).
'steal' again could probably go all over the place, from a flat short tone (for example, a child trying to sound cute while denying stealing chocolate), to a rising tone giving denial, to a falling tone signalling finality.
'Your' 你 or 你的. Dropping the 的 makes a definite difference in meaning: without it, the phrase is very informal. As a personal pronoun, all kinds of tricks can be performed in tone to turn it from prejudicial to friendly.
'Money' 钱: at this point in a sentence, especially one deliberately chosen to show where stress is necessary, it doesn't matter. The tone of the sentence is already established by the combination of _augmented_ tones of individual words. No one speaks to each other like a computer dictionary of perfect tones, just as no one writing English writes the way an Apple Newton's handwriting recognition in 1995 needed.
Chinese is a tonal language, but it is one that's spoken in sentences. Sentences make sense; static words don't. My greatest frustration is sometimes not with understanding emotion or meaning, but with flat-out saying a noun or verb incorrectly because I screw up the tones, and then being met with confused looks. Tone patterns across words in modern Mandarin sentences are extremely close to English sentence tones. Speakers can choose to avoid this by speaking in the traditional way (commonplace only 30-40 years ago), but today, outside of official functions, slightly older speakers (50+), or deliberate use, they don't.
Sorry for the verbose answer, it's 2am here. I found I "just had to figure it out" when landing on a plane a few years ago.
Sorry, I didn't write precisely and didn't mean to imply that Chinese didn't have similar sorts of stress/tone things as English. Definitely, both languages imply a lot based on how you speak a word. I really just meant to compare the relatively easy business of tones-as-pronunciation, as Mandarin has, with the much more difficult and complicated business of tones-as-extra-implications, which Mandarin, English, and probably every other language has.
Tones are to native English speakers what L's and R's are to native Japanese speakers. Because tones don't carry semantic information in English, we tend to ignore them, which leads to English speakers sounding really weird when speaking any Chinese dialect.
Unless we're talking about different things - tones carry semantic information, just not the same kind as in Chinese. They indicate things like "I'm asking a question" and "I'm being sarcastic".
This does make it hard for English speakers learning Chinese, but I don't think it's quite the same as L/R in Japanese.
Indeed. Chinese isn't really more difficult so much as it's different. And the difference causes people to think it's difficult, which then becomes an excuse for poor results when teaching/studying the language. Though I will say it tends to be poorly taught, so I guess it's more difficult in that sense (people will struggle with anything that's poorly taught).
And it's poorly taught because there is a lack of people who really master the two languages (Chinese + language X that is very different from Chinese), which is due to the fact that... ...the two languages are very different in the first place...
I don't think someone would need to be a master of both languages to greatly improve the way it's taught. For instance, one of the problems is that characters tend to be taught to foreigners in a completely bizarre manner; you end up learning many compound characters before you learn the component parts. Other issues include the small amount of Chinese in the course (lots of speaking in English about Chinese), the focus on things like grammar, and the idea that Chinese is so difficult that poor results are acceptable.
I've seen a number of people with a low level of Chinese develop self-study plans that were much, much more effective than any class or textbook I've seen.
OP here. I think the reality is obviously more messy than what the article tends to portray. All natural languages are messy in their own idiomatic ways.
One of the goals of the article was to emphasize the structure in the Chinese language, because most people who are not familiar with it assume it's just a big bunch of meaningless ideograms.
It's a "simple" introduction, not a thorough one. It's meant to tease people's curiosity.
If you start with a negative bias towards Chinese, you won't be able to see where it shines.
Things that you express with tones in English can be expressed with particles in Chinese. They can also be expressed through other modulations at the sentence level.
Some subtleties of English definitely can't be expressed in Chinese, but the reverse is equally true. And that's the point I want to stress. Chinese is very different, and where there is difference, there is richness.
I notice however that you've taken a graphic from the front page of my website: http://www.hanzigrids.com (though interestingly, a slightly older version than the graphic currently there).
You're welcome to keep using it, but it would be nice if you could add some attribution.
It's something I've been planning to add for ages, unfortunately I have a number of other priorities at the moment which mean that work on HG tends to get put on the back burner.
As a Chinese person, I can tell you that there are only 3000 common Chinese characters. My nephew began to read newspapers in grade 4 (10 years old). I began to read very complicated novels at 13. A good thing about Chinese characters is that I can easily read laws if I want, because there is not much new vocabulary to learn.
Tone is not a problem for babies. They can grasp it from day one when they start to speak. (I have 3 kids.)
There is a lot of logic inside Chinese characters.
The problem with tone isn't so much that it's hard (not more so than, say, a voicing or aspiration contrast is), but that as people get older, it's harder to hear contrasts that don't exist in the languages you're familiar with.
I've been living in Vietnam for years now and watched many foreigners struggle to learn the language, which depends even more on tones than Chinese. Tonal languages are very unforgiving, and the differences between tones are very subtle in common conversation. Even when Vietnamese completely mangle the grammar and pronunciation of English you can usually understand what they're trying to say. But if you don't absolutely nail the tones and pronunciation of Vietnamese all you'll get are blank stares back in response. That is, if you haven't accidentally said something wrong or vulgar by mistake.
The saving grace of Vietnamese is that the grammar is trivially simple compared to English. No plurals, tenses, cases, verb conjugation etc. And it uses the Latin alphabet so reading and writing it is vastly easier than Chinese.
Sure, the alphabet is a big plus for someone who doesn't know any Chinese (or Korean or Japanese). On the other hand, the old writing system (chữ nôm) could be extremely useful for learning vocabulary, given how many words are of Chinese origin.
I was working with a Vietnamese-French dictionary (Đại Nam quốc âm tự vị hợp giải Đại Pháp quốc âm 大南國音字彙合解大法國音) and I was struck by how many words I could understand without reading the definition, because they were written in Chinese characters; of course, when the entry was in nôm characters this didn't happen. The readings of these words also weren't that surprising compared to Japanese On readings or Mandarin Chinese.
I don't speak any Chinese at all but people tell me that there are still a lot of similar words in modern spoken Chinese and Vietnamese. You can still see remnants of the old, pre-latinized writing system here.
As someone currently learning Vietnamese, do you have any tips for learning the language? I've learnt some Mandarin so I am no stranger to tones, but the tones in Vietnamese seem quite different.
I've been using Pimsleur, which I highly recommend, but I find I end up attuned to their particular voices and cannot understand genuine Vietnamese speech. I am also using Memrise to learn vocabulary, but without example sentences and the like I find it of limited use until I meet the same word in a real-life context.
There's no substitute for one on one tutoring with a native speaker. Get somebody to drill you on the tones and the alphabet. Once you have those down you can work on vocabulary on your own.
If I'd done this when I first got here I'd be a lot further along now.
I just wanted to comment on your point about tones. I'm a pretty mediocre Chinese speaker, but I find it much easier to hear Chinese than English. At the level of full sentences, English is far easier for me, but at the word level the advantage is equal in magnitude favoring Chinese. I think it's because it's much easier to hear the voiced sounds that comprise the majority of Chinese, whereas in English a lot of the meaning is loaded into the high-pass unvoiced component. From the perspective of the listener, it seems to make much more sense to use the higher-SNR of voiced sounds to convey the bulk of information (as I believe Chinese does), rather than the much weaker unvoiced sounds (which English makes heavy use of.) I'm guessing the prominence of unvoiced sounds in speech owes to their being less effort to form, or to their universality across different people whose voiced sounds would have substantially different frequency components.
"You're losing a lot of expressiveness and subtlety in the process."
Well, er..., classical Chinese poetry is quite subtle and expressive, I have heard.
And, I'm not sure about your native language, but in mine, if I read or hear a word that I do not know, I also have to pick up the dictionary. Moreover, in many cases, I will not know how to pronounce a name I have never heard before.
And, well, building a search engine is much simpler in Chinese, because characters do not change. So yes, there is (almost) no stemming, because you do not need to stem the words before indexing or searching.
Another point you did not mention: printing Chinese characters on paper or a screen is much simpler, since all of them have the same width and height. No kerning. No need for text justification (almost). Did you know that when preparing a book in French, typesetters have to make sure there are no "rivers" on the page and adjust word spaces accordingly?
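To make the indexing point above concrete, here is a rough sketch of one common way to index Chinese text without any stemming: overlapping character bigrams. This is an illustration only, not necessarily what the commenter had in mind; the function names and the three toy documents are made up for the example.

```python
from collections import defaultdict

def bigrams(text):
    """Return overlapping two-character chunks, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]

def build_index(docs):
    """Map each bigram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in bigrams(text):
            index[gram].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every bigram of the query."""
    grams = bigrams(query)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result

# Hypothetical toy corpus, for illustration only.
docs = {1: "我喜欢学习中文", 2: "中文搜索引擎", 3: "学习搜索"}
index = build_index(docs)
print(search(index, "中文"))      # {1, 2}
print(search(index, "搜索引擎"))  # {2}
```

No character is normalized or stemmed before indexing, which is the point being made. The trade-off is that bigrams are not words: single-character queries find nothing in this sketch, and a query can match across a word boundary.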
I once built a Chinese search engine, which worked quite well. Stemming isn't an issue as Chinese has absolutely no inflection. The problem is that there are no spaces between words. To separate them, I used a dictionary and a greedy algorithm. At any point in the text, I simply found the longest word in the dictionary which matched and committed to that. On rare occasions, it would get it wrong but soon recover (e.g. if my next few characters are ABCDE and the longest matching word is ABC, I'd match that, followed perhaps by D and E, even if the correct tokenization is AB CD E). Without a dictionary you can't meaningfully tokenize Chinese.
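The greedy longest-match approach described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: the function name and both toy dictionaries are made up, and falling back to a single character when nothing matches is an assumption about how such a tokenizer recovers.

```python
def greedy_tokenize(text, dictionary):
    """Split text by always committing to the longest dictionary word
    that starts at the current position."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until one matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # Assumed fallback: an unknown single character becomes its own token.
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens

# The ABCDE case from the comment, with a hypothetical dictionary.
print(greedy_tokenize("ABCDE", {"AB", "ABC", "CD", "D", "E"}))
# ['ABC', 'D', 'E']  (the intended split was AB CD E)

# A commonly cited real-world instance of the same failure.
print(greedy_tokenize("研究生命起源", {"研究", "研究生", "生命", "命", "起源"}))
# ['研究生', '命', '起源']  (the intended split is 研究 生命 起源)
```

In both cases the tokenizer commits to the longer word too early, then gets back on track a token or two later, which matches the "gets it wrong but soon recovers" behaviour described.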
> You can't use tones to communicate things like irony, anger or doubt. You have to play on volume. You're losing a lot of expressiveness and subtlety in the process.
That's not unusual even in languages without tones. For instance, in Irish, you can't use vocal stress to emphasise particular words like you can in English. Instead, you use a combination of changes in word order and emphatic affixes to convey the same meaning. No expressiveness is lost; it's just that Irish uses a different mechanism to convey the same meaning.
My understanding of Mandarin and the other Sinitic languages is similar: they might not use the same mechanisms to convey such information as in English, but they certainly can convey them.
> Moreover, combining one character before (or after) another can change the meaning entirely. Adding a third one can again change things.
vent
con-vent
con-vent-ion
Mandarin is no different from any other language in this regard. It's just that its writing system masks what we'd consider words in languages with alphabet-like scripts. Also keep in mind that Chinese characters make use of punning, so sometimes those weird character combinations are simply very, very old visual or phonological puns.
[There's a little bit of cheating in there, because the 'vent' in 'convent' is more like 'ven-t', but the principle still applies, and with some thought I could find better examples. However, since Chinese characters do rely on punning like this in places, as mentioned, it's not a huge flaw in the example.]
> You can't apply a Porter stemming algorithm for example; there's no concept of stem or etymology (no apparently logical one at least).
There are a few reasons for that, but the main one is that Mandarin is a largely isolating language. However, to apply some equivalent of Porter stemming, you need to rely on there being a concept similar to a word in the language and its writing system, whereas Chinese characters are more akin to morphemes, so the concept of 'stemming' makes no sense. Pinyin is another matter though.
> there's no concept of [...] etymology (no apparently logical one at least).
There is, it's just obscured by how divorced the written language is from the spoken language, which isn't the case for languages with alphabets, syllabaries, &c.
Yeah - my son has a hundred ways to say 没有 that are playful, angry, bossy, funny, etc. It is part of the language, you just use different vocal variations than the classic English ones like 'raise your tone to indicate a question', or 'use harsh dropping tones to indicate anger'
> The bottom line is that even for a native speaker, if you encounter a new character, chances are you don't know what it means or how to pronounce it. If it's simple enough, you may guess, but that's it.
Native speaker here, this is definitely true. On top of the characters
Also, children learn the phonetic alphabet[1] before learning the characters. That's why it takes so long to even read a simple magazine: no one uses the phonetic alphabet outside of lower level elementary school.
This article is like the "Learn to read Korean in 15 minutes"[2] but more like "Learn to read 0.01% of Chinese characters in 15 minutes", but interesting nonetheless
On the other hand, I would think that a language that uses tones has higher "bandwidth". When you use tones, you can use shorter words. With these shorter tone-words, you can "transfer" more information in less time, compared to a non-tonal language.
This is true. Vietnamese is tonal and mostly monosyllabic, for example. But it turns out that people tend to speak very information dense languages like this more slowly, so the overall information density is similar among languages. Maybe humans can only process information so fast?
Yes, I started learning a bit of Mandarin this year. Interesting, but tedious. Imagine they constructed their numeral system the way they constructed the (non)alphabet. Instead of 0-9, and all numbers constructed from those, there would be a million characters for counting to one million.
If you want to learn to write Chinese characters, I recommend the Skritter mobile app. It took me only a month to commit about 300 characters to memory with 85% accuracy (incidentally, mostly the same 300 characters I had completely forgotten after learning them 10 years ago).
I would recommend spending more time up front practicing speaking and listening, and not obsessing too much about the characters, although you will eventually need to learn them. I learned the opposite way and I regret it: I spent a lot of time learning to read Chinese, especially vocabulary through Pinyin, and while I know about 1000 words now, enough for basic conversation, my listening skills are horrible. I can speak, but I don't recognize what people are saying in response because I mentally spend too much time trying to assemble the phonemes and translate them via "slow brain".
Honestly, the best thing you can do to learn is to join a group or find native speakers to practice with. Today it's easier than ever with online chat through HelloChinese or Verbling, or other mechanisms over Skype or Hangouts. I found lots of local groups on meetup.com where native speakers help learners practice Chinese and, in return, native English speakers help them with their English. Something about face-to-face practice helps burn things into your mental muscle memory more than independent study via reading.
I started out deciding I would never bother learning the Chinese characters and only focus on speech and listening.
But eventually I hit a point where I started to get a strong desire to learn them. I got really curious about them, and wanted to be able to read signs and menus. So I spent a year getting my reading skills up to about the same level as my listening/speaking.
I think that was a nice way to do it. Learning to read helps with listening/speaking skills too, but it's way too daunting to try to learn Chinese characters until you can actually speak.
Thanks for the tip for the app. I used Memrise myself.
For listening I recently discovered PopChinese (http://popupchinese.com/), which provides free podcasts. I like it because even beginner-level dialogs are spoken at real speed. And I see the fact that they only provide transcripts for paying subscribers as a feature: it forces me to concentrate on actual listening until I get it all.
There are many reasons why Chinese is difficult for English speakers to learn.
(1) There are many thousands of characters, so you're effectively learning two languages at once: spoken Mandarin and written Chinese. Reading Chineasy might help you get started but won't help much after that. Alphabetic languages have at most a few dozen characters.
(2) There are no cognates. German and Swedish, for example, have lots of cognates recognizable to English speakers.
(3) There are very few loan words. There are lots of English words in Japanese, and lots of Japanese words in English.
(4) There's a huge cultural gap.
(5) There are different languages all called Chinese. Learning Putonghua won't help you much with pre-revolutionary (1911) texts, as they were written in Classical Chinese.
(6) Unless you go to China, the Chinese people you meet will usually speak better English than you will Mandarin, so conversations will lapse into English.
On the plus side: The tones are the least of your problems, and pronouncing Chinese isn't much harder than many other languages. Chinese grammar is dead simple. It has a fixed word order, which is similar to English, and there's no inflexion. I'm also told that once you know a certain number of characters (still several thousand), you can easily guess and infer unfamiliar words. The Practical Chinese Reader series stops using English altogether once you reach Book 5. (It's been superseded by more modern texts.)
I've been learning Mandarin for a long time. I can understand a decent amount and I can speak pretty well, but I can't read much or write worth a damn, and my vocabulary is small. I also speak French, and learning to read and write took basically no time at all (just quickly learn some pretty regular rules, much simpler than English) and a lot of vocabulary carried over.
So in short, yeah, spot on.
I do think tones are a bigger problem than you let on. You can get by without them. I've met people who are pretty fluent except they just have no command of tones, and they can still understand and be understood for the most part. But the intimidation factor is huge, and even if you can be understood, people want to speak well not just adequately.
I think the language would be easier to learn if we'd punt on reading and writing for the first year or two. Every curriculum I've seen incorporates writing heavily from the start. As you note, it's like learning two languages at once. That's a lot of extra effort to put in! I imagine it would be much easier to learn these effectively-two-different-languages serially rather than in parallel. Attain basic proficiency with speaking and listening, then move on to how to read and write what's already known.
I don't think a learner should skimp on reading, but I think it's worth asking whether so much time should be spent learning to write Chinese. It's a lot of rote practice that can't be sped up, and in China, writing has been replaced by typing (on computers and cellphones). Typing Chinese using any modern pinyin IME is a reading exercise.
I'm learning Mandarin right now and I love the grammar. I studied Spanish in high school and the lack of conjugation in Chinese is superb. I even feel sorry for people learning English now.
我是很酷。
(Wǒ shì hěn kù.)
I am cool.
你是很酷。
(Nǐ shì hěn kù.)
You are cool.
他是很酷。
(Tā shì hěn kù.)
He is cool.
她是很酷。
(Tā shì hěn kù.)
She is cool.
我们是很酷。
(Wǒmen shì hěn kù.)
We are cool.
他们是很酷。
(Tāmen shì hěn kù.)
They are cool.
That 是 (shì) stays consistent through all the sentences. Love it. Also the fact that 他,她,and 它 (he, she, and it) all have the same pronunciation (tā).
Actually, you don't even need to use a verb here. 我很酷(Wǒ hěn kù)is enough.
This is probably something you didn't learn yet.
You should think of it as a sort of tarzan-talk: "Me very cool!".
If you use 是, you actually put emphasis on the sentence. Like: "You're not cool.", "I AM cool!".
That (nice!) article gives 9 reasons. Only 6 of them are applicable in 2015.
#3 is now irrelevant because people don't write with pens. They type on their phones or, when at work, on their laptops.
#5 is no longer true, as dictionaries are now on Pleco. Yes, I found it hard to look up words in a paper dictionary when I started learning. But, for today's learners, their first dictionary is Pleco on iOS/Android, and this pain is gone.
#7 is a weird one. You only need one romanization method (Pinyin) to learn Chinese, or to look up the standard (Mandarin) pronunciation of a new character. When would I ever need to use or read Wade-Giles, or any other romanization method, unless I'm reading the romanized version of a Taiwan/HK person's surname?
>#3 is now irrelevant because people don't write with pens. They type on their phones or, when at work, on their laptops.
First of all, this isn't true. Every convenience store in China sells pens. Billions of pens are sold per year in China. Even at the tech company where I worked in China, everyone had a pen or pencil and some wrote notes with them. People often used markers and wrote things on whiteboards, too. It was pretty much the same in that sense as English speaking places.
Secondly, the main point in #3 wasn't about writing. It was about learning. If you see a few taquerias in Mexico with "Taqueria" written in the sign, you'll learn a new word and how to pronounce it. You might later realize you've been hearing the word "taqueria" in a radio commercial and suddenly understand more of it. In China, you could enter a post office, see 郵局 written on it and hear the word 郵局 in conversation every day for a month and still not understand what it is people are saying. That is why the writing system slows learners down, even if comprehension is their goal.
You're right, especially about the reinforcement bit.
There are some characters that I can recognise because I've seen them often enough on Taobao, but that don't come up in everyday conversation. So I don't know how to pronounce them, and wouldn't recognise them if someone said them out loud.
Wade-Giles is used very, very widely in English-language historical writing. Particularly anything that is more than about twenty years old. So you have to know what the names of people and places are in both Pinyin and Wade-Giles, and often enough in whatever less-standard romanization the author happened to use.
People write with pens often. Although characters per second is much faster on a computer, there's less total overhead to writing with pen and paper in many situations.
Eh, the complaints about the characters seem mostly overblown. I think I picked up about 2,000 in a year, and that was not studying particularly hard. Past a certain point most of the new characters are combinations of characters you've already studied. Of course, if you're not practicing your reading it's .
Also, if you have at least a decent level in the language, it's not going to be unfamiliar characters that give you trouble in newspaper articles (most of the unfamiliar characters you run into will be from names or chengyu), but Chinese words you're unfamiliar with. And in these cases, the characters actually help; you're going to have a much easier time guessing what "jacuzzi" or "phlebitis" mean in Chinese than in English.
Relatedly, people might find Tan Huey Peng's "Fun with Chinese Characters" series of books enjoyable. Each entry offers an archaic writing, and its current form, outlining the origins of each, as well as the stroke ordering.
Sadly, it seems the series fell out of print, but it's available on iBooks (but not Kindle, apparently):
> The last part I interpret as something like: “the mainstream idea of shared production”, in other words, communism.
This is something I love about Finnish as well. I don't know enough Finnish to say if it's a general pattern, but I've noticed that a comet is a "tail-star", a toad is a "crust-frog", the world is the "land-air" and there's a couple of other examples I've collected over the years. I'd love to learn that language.
Strokes aren't important only for muscle memory and how to remember the characters--the order also helps with symmetry and the overall shape/size of the finished character.
Another fun part of Chinese is that most characters are monosyllabic. This makes numbers easy to remember and, on average, Chinese speakers can remember a few more random digits than English speakers can.
A rosy picture inevitably includes a number of thorns...
I'm learning Chinese, and am having a wonderful time doing so. But there's no doubt that there's a great deal of historical baggage with the language, which is bewildering and exhilarating at the same time.