They clearly ignored that rule for the name of the language =)
On a more serious note, this is rather exciting. It would make a very good candidate for a universal language. A great deal of time and effort went into making this language. Hopefully this isn't the last time I hear of Lojban.
If you like this, you should read In The Land of Invented Languages by Arika Okrent. Aside from being the best non-fiction book I've ever read, it talks about Lojban and the characteristics of invented languages that make them suitable for becoming a universal language. It turns out Lojban would make a terrible universal language for precisely the reason I pointed out in my other post. It's too complicated. When we speak, we frequently start a sentence not knowing how it's going to end. We pause and insert filler words to give us time to form our thoughts. We use ambiguity because sometimes we just don't know precisely what we're trying to say (Lojban has, if I recall correctly from Okrent's book, over 30 ways to say 'and'). Forcing people to have a complete understanding of what they want to say before saying it won't make them clearer when they talk; it'll make them not talk.
> When we speak, we frequently start a sentence not knowing how it's going to end.
You can start and continue a Lojban sentence indefinitely via various means. Metalinguistic markers such as sei ... se'u allow a discursive (on-the-fly) predicate or sentence. You can insert parenthetical notes with to ... toi anywhere. You can insert the attitudinals (used to express attitudes, emotions, evidentiality, etc.) anywhere. A construct called tanru allows an endless sequence of predicates (whose form in Lojban does not alter according to the natlang parts of speech such as adjective or adverb, thus imposing fewer restrictions on the way you keep forming a sentence than natlangs such as English do). With li'o you can omit any quantity of text you don't want in your expressions. With si/sa/su you can 'erase' various mistakes in your utterance. And so forth.
> We pause and insert filler words to give us time to form our thoughts.
That's what y is for in Lojban.
> We use ambiguity because sometimes we just don't know precisely what we're trying to say
The unambiguity of Lojban is mostly syntactic, not always semantic. You can be semantically ambiguous in Lojban with e.g. tanru.
> (Lojban has, if I recall correctly from Okrent's book, over 30 ways to say 'and').
Which includes ju'e, a vague connective for "and".
> Forcing people to have a complete understanding of what they want to say before saying it
Syntactic ambiguity is half the point. I still have to decide whether or not I want to use the vague/all-purpose connective for "and." If I want to add parenthetical information, I have to realize I'm doing that and indicate it. If I want to "erase" a mistake, I have to recognize that that's what I want to do and use the proper word. I still have to know exactly what I intend to say. When I'm speaking, I don't think to myself, "Ok, this is parenthetical... this is correcting an error... this is vague and all-purpose." I just speak.
In Python, whether or not a method is private is defined basically by whether or not you call it from another class. In Java, you have to be explicit about it. You CAN just declare everything public (be vague and all-purpose), but aside from being frowned upon, you still have to consciously decide to do this. If you don't care about the protection of scope, you can make everything public, but you're still specifying scope. In Python, you don't specify scope at all. Even if you wanted to, you can't.
The same thing shows up in type checking. If you want, you just call everything an object in Java. It's vague and all-purpose, but you still have to specify a type. You still can't write 'a = 4'. You still have to write 'Object a = new Integer(4);', which it's tough to argue is simpler just because it isn't type-checked.
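To make the Python side of that comparison concrete, here is a minimal sketch. The class and method names (Greeter, _decorate) are made up for illustration. It only shows that nothing in the code declares visibility or a type; the leading underscore is a naming convention that Python does not enforce.

```python
class Greeter:
    def greet(self, name):
        # No access modifier anywhere: nothing here says "public".
        return self._decorate("hello " + name)

    def _decorate(self, text):
        # The leading underscore is only a convention, not an enforced scope.
        return text + "!"


a = 4  # no declared type, unlike Java's `Object a = new Integer(4);`

g = Greeter()
public_result = g.greet("world")
# The "private" helper is still callable from outside the class.
private_result = g._decorate("hi")
```

Whether a method is treated as private here depends on how callers behave, which is the "implicit" case the comment contrasts with Java.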
Explicit ambiguity is not much closer than explicit disambiguity to implicit ambiguity.
I realize at this point, I'm getting beyond my own knowledge of Lojban, but I just want to make the general point that being able to be ambiguous does not automatically afford the advantages of the natural ambiguity in "natlangs."
> I still have to decide whether or not I want to use the vague/all-purpose connective for "and."
If you are undetermined, you use the undetermined option, the vague one. You don't really make a decision for that.
> If I want to add parenthetical information, I have to realize I'm doing that and indicate it. If I want to "erase" a mistake, I have to recognize that that's what I want to do and use the proper word. I still have to know exactly what I intend to say.
Planning is not required for adding parenthetical information. You use it on the spot where you happen to want it.
If you don't recognize that you want to erase a mistake, you just don't use the erasers.
> When I'm speaking, I don't think to myself, "Ok, this is parenthetical... this is correcting an error... this is vague and all-purpose." I just speak.
I'm not a native English speaker. When I started learning and speaking in English, I would think to myself, "Ok, the word "which", when used after a slight pause, which corresponds to the comma in writing, it means a non-restrictive relative pronoun, which is what I want now, so I'm going to use it that way". As I kept practicing the language, I internalized the rule, becoming less and less actively conscious of it. Lojban is no exception. I have already internalized some parts of the grammar to which I used to pay much attention in my earlier period of learning.
Also important to note is that what English expresses with non-verbal properties such as intonation, Lojban can do verbally. When you orally say something in English which you would put in parentheses in writing, your speech act is still subject to some phonetic principles such as inserting a pause, changing the rhythm, lowering the pitch, using less breath, and so on. You consciously or unconsciously have to be in command of these properties if you are to successfully deliver your utterance. A particular combination of these non-verbal phonetic properties is what Lojban expresses with to ... toi, the parenthetical markers. Whether or not you are actively conscious of them is a matter of internalization. Just as an experienced English speaker does not always actively think about each noun's grammatical number in their utterance but still manages to add the plural marker "s" where appropriate, an experienced Lojban speaker would be able to correctly start a parenthetical note with "to" without always actively checking whether or not that's what they really want to say.
> I realize at this point, I'm getting beyond my own knowledge of Lojban, but I just want to make the general point that being able to be ambiguous does not automatically afford the advantages of the natural ambiguity in "natlangs."
Could you give me an English example with such advantages?
I agree that would disqualify it as a universal language candidate. At the same time though, how often do you wish you'd thought out your next sentence in its entirety when speaking to your girlfriend/boss/mother/etc.?
I'm a rather thoughtful speaker, so I rarely regret my words, but this is a somewhat separate issue. There are lots of factors that affect how likely someone is to say something they will later regret. The biggest thing that makes people say things they'll regret isn't a language, it's the internet. If you say something you might regret in a letter, it's pretty easy to catch it before sending it. In an email, not so easy. In an IM, you're likely to send the message before you've even read it back.
What this tells me isn't that we need better tools (whether they be software or languages or anything else), it's that sometimes people say stupid things, and we need to be more understanding. A hundred years ago, it may have been reasonable to expect that a long-distance friend would never say anything rude to you, because they'd have time to edit their words. It isn't reasonable to expect that anymore. People haven't changed, but their ability to edit themselves has. Instead of trying to recreate that ability to edit ourselves, we need to evolve our interactions with each other so that these thoughts that we had time to edit before don't play such a big role in our communication.
In Lojban, the letters which appear in the word Lojban are pronounced thusly:
L as in Logger;
O as in wOn't;
J as in aZure;
B as in Boy;
A as in wAter;
N as in Nice.
Lojban is therefore pronounced LõZH-ban, with stress on "LõZH".
"Lojban" is a cmene (name) derived from the lujvo (compound word) "lojbau", a conjunction of "lojbo bangu", a tanru meaning "logical type-of language".
"Lojban" is not ambiguous as to spelling, being phonetic (unlike "phonetic") and also non-ambiguous in meaning, since the name for the logical language Lojban means "logical language".
> Lojban: "spelling is phonetic and unambiguous"
> They clearly ignored that rule for the name of the language =)
I don't understand your point. The name of the language follows precisely the same rules of spelling and pronunciation. Can you expand on why you think otherwise?
Although a few people want a Universal Language, the Logical Language Group which administers the Lojban language standard has never sought that as the purpose of Lojban. It is a thought experiment.
What they don't tell you is that Lojban is so complicated no one can speak it. Go to the forums and pick almost any thread that contains Lojban, and you'll find debating about the language. If the supporters can't even speak it right, what hope is there for those who just want to communicate?
My experience suggests that there are a handful of people who can read, write and speak the language fairly well, using it, when they choose, as their primary means of communication.
Most of the discussion falls into two types.
The first type is when a newbie tries to transliterate into lojban, and often says something they don't mean. This is common with anyone learning another language; it's just that in lojban one can be much more precise about what was wrong, because there's a formal grammar. Common confusions are mixing "verbs" with "nouns", neither of which has an exact matching concept in lojban, hence for some it's tough to get a grip on what's going on.
The second is debating the semantics. What does this word, explicitly created for lojban, actually mean? What's the difference between a pot and a jar? How, in English, do we decide which word to use? What is the defining characteristic of each?
These are often held up as criticisms of the language, but I think unfairly so. They provoke debates in epistemology, semantics and pragmatics, and touch on many of the questions that will need to be answered by the machine translation, semantic web and artificial intelligence challenges.
And what's the point in learning a language that's just a re-coding of your native language?
The other undeniable short-coming is that there is no underlying "culture." Much of the interest in learning, say, Russian, comes from the cultural insights. There is none of that in lojban.
Lojban isn't ready for generic, general purpose communication. It's still growing. If all you want is a tool, don't go there.
If you want mind-expanding concepts and debate, have a look.
There are many people who can converse fairly well in real-time written Lojban (although vocabulary is limited, and some parts of the language are generally not used).
Due to audio-visual isomorphism, being able to converse in written Lojban almost implies being able (theoretically) to converse in spoken Lojban. However, people are generally much less used to speaking and hearing Lojban, so spoken conversations are usually somewhat difficult (definitely far from fluent).
Right. It's been around since 1955 (in spirit at least, in the form of Loglan) and the best anyone can muster is to converse "fairly well" in "real-time" but not in spoken conversation. In other words, it's a language no one can speak. Klingon speakers are even more fluent than Lojban speakers.
I'm not trying to say Lojban is stupid. It's not. It's cool. It fills an interesting niche in the language world. However, I'm not convinced it's speakable. Remember the quote by Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." I would argue that speaking a language is at least twice as hard as inventing it or writing it in a non-real-time fashion, so if you create the most complicated language you can still manage to write (however laboriously), you are by definition not smart enough to speak it. Lojban is that language.
Lojban is not designed to be complicated. The important parts of the BNF grammar are about a page, and easily understood in an hour or so of properly directed study.
The hard parts are the vocabulary and the mind-set. It is specifically designed to be different, but even then, there are sufficient similarities that it doesn't have to feel completely alien. Some people present it as such because they think it will attract people, but I know that sometimes familiarity is a better draw.
It can be presented either way.
And people tend to say uncomplicated things in Klingon, whereas most lojban speakers are exploring saying very complicated things - you're not comparing like with like. Further, although gatherings of Klingon speakers are fairly common - piggy-backing on Star Trek conventions - gatherings of lojban speakers are rare. Even so, there are several people who speak it fairly fluently, and the number is growing (in some sense).
It is often criticised (I'm not saying you are doing so) for being something other than what people think it "ought" to be. This is a marketing issue. I think it's interesting, and, like learning Lisp, it has expanded my mind in interesting ways.
Lojban may not be designed to be complicated, but it's designed to be complicated. By that I mean, while the goal might not be "make it complicated," the goals necessitate that it will in fact be complicated. There is no way to remove all ambiguity from language without making it exceedingly complicated.
This is a persistent, incorrect interpretation of what people mean when they say that lojban is intended to be unambiguous.
Parsing lojban is unambiguous. Interpreting lojban quite specifically is not unambiguous.
Given a grammatically correct construction there is a unique parse. That is what is meant by "unambiguous". There are no problems akin to "Machines need to wreck a nice beach."
There are manifold ambiguities, however, in the semantics. When one talks of "lo sutra tavla" there is no indication as to the sense in which the speaker is fast. Perhaps the speaker produces many words per minute, or perhaps the speaker runs past while talking. These ambiguities can be reduced by using more precise expressions. Metaphorical use is frowned upon, so what it does not mean is one who persuades in a fraudulent manner.
For example, we can say "lo gerku" which refers to a dog, or some dogs, but gives no idea of how many. We could say "re le ci gerku", which means "two of the three dogs." More precise.
I can say "mi tavla", which means "I speak" or "I will speak" or "I have spoken" and even leaves the audience, topic and language unspecified. I can say "mi ba tavla" or "mi pu tavla" which are future and past respectively. I can be even more precise if necessary or desirable.
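As a rough illustration of tense being optional, here is a toy Python sketch. It is not a Lojban parser: it only knows the two tense words mentioned above (pu and ba) and the function name describe_tense is invented for this example.

```python
# Only the two tense words mentioned above; this is not a full inventory.
TENSE_GLOSS = {"pu": "past", "ba": "future"}


def describe_tense(sentence):
    """Return the tense marked in a toy 'mi [tense] tavla' sentence."""
    for word in sentence.split():
        if word in TENSE_GLOSS:
            return TENSE_GLOSS[word]
    # No tense word present: the time is simply left open.
    return "unspecified"


unmarked = describe_tense("mi tavla")
future = describe_tense("mi ba tavla")
past = describe_tense("mi pu tavla")
```

The point of the sketch is the fall-through case: leaving the tense word out is valid input, not an error.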
> ... my understanding based solely on reading ABOUT Lojban, I don't know a single Lojban word... aside from Lojban
I think you have fallen into the trap of not reading enough, and mis-interpreting some of what you have read.
I'm not surprised, I think much of the early material written about lojban was written without regard for how it might be mis-interpreted. Politicians today generally say nothing, because everything they do say runs the risk of being taken other than intended. The early lojban writers (writing about lojban, not necessarily in lojban) needed "spin doctors" to ensure that what they said could not be mis-interpreted.
All that aside, lojban is intended to be an expressive language, suitable for communication. Therefore it will be complex, although the complexities are not necessarily those of natlangs. I suspect that we are not that far apart. We agree that:
- lojban is complex
- lojban is not currently suited for general use
- lojban is cool
I further believe that:
- learning lojban (at least beyond "mi tavla") is mind-expanding
- learning the structures of lojban teaches more than just lojban; it teaches about structure, syntax, and monolinguistic assumptions.
> There are no languages that are just re-coding
> of other languages.
That's true, but it's "less true" of Italian/Spanish or Russian/Bulgarian than it is of, say, Russian/Swedish. I suspect that we are in complete agreement, and might argue only about precision of expression in our opinions.
And I didn't say lojban is unsuitable for communication. There are a few people who use it exclusively when communicating with each other. It is not yet ready for generic, general-purpose communication, just as the newest computer programming language isn't ready for mainstream use by those not interested in the language so much as getting things done.
But again, I suspect we are/would be in "violent agreement" about most of these points. I contend that lojban is, regardless, mind-expandingly interesting.
How complex is "slightly non-trivial"? Hmm. Here's a recent example:
le se viska be mi cu xamgu
Translation:
The thing(s) I (or we) see is/are/were/will be good.
Probably from context, more colloquially:
What I see is good.
Be advised: not only am I not an expert, I class as a newbie with dangerous knowledge. I have read, understood and worked with the formal machine grammar, and I have written some tools to work with parsed utterances. I do not have, and probably never will have, any command of the vocabulary.
I'm happy to try to answer questions, and will defer those that are beyond me to one of the lists on which I lurk.
Thanks for the sentence, that's certainly nontrivial enough (I was just trying to dodge 2-3-word-sentences like "jesus wept" or "i like milk", which rarely show anything interesting at the grammatical level).
Not-mocking: there's an apparent lack of number ("thing(s)", "I (or we)"). Is it closer to true that:
- the grammatical root of this apparent lack of # is shared between "thing(s)" and "i (or we)"
...or that:
- "thing(s)" is a gloss of some word that's ~ "specific things not specifically specified"
- "I (or we)" is a gloss of some word that's ~ "who I speak for" (or some other deictic term that's ~ first-person but otherwise underspecified)
...or to some other possibility? Additionally: given that it's a designed language, I'm curious about the underlying intent (whatever it is that explains why the answer is what it is, instead of being something else).
(Disclaimer: wiki's lojban articles have resolved a lot of my other questions, but before I asked you I'd only looked at lojban's wiki's articles, which are mostly unhelpful.)
Short answer: the latter is more true. The reason why {mi} is unmarked for number is simply that it’s defined that way.
In Lojban, everything is unmarked for number by default. It’s actually quite rare to see things explicitly marked for number, as it’s usually either irrelevant or obvious from context.
The pronoun {mi} is technically unmarked for number, but is restricted to refer to people that the speaker represents, just as you guessed. For example, it would usually be weird or incorrect to use {mi} to mean “we” in the sense of “me and you”, since representing the very people you are talking to is a rare situation — although theoretically you could come up with examples where it would make sense.
So in practice {mi} is usually singular. On the other hand, {do} (which means “you”) is as often plural as it is singular.
There are other pronouns that mean “me and you”, “me and others”, “you and others”, and “me, you, and others” — respectively, {mi’o}, {mi’a}, {do’o} and {ma’a} — which is another reason why the need for plural {mi} seldom arises.
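The pronoun list above can be laid out as a small lookup table. This is only a sketch of the glosses given in this comment, keyed on who is included (speaker, listener, others). The table and function names are made up, the apostrophes are typed as plain ASCII, and it ignores the subtlety that {mi} covers anyone the speaker represents.

```python
# (speaker included, listener included, others included) -> pronoun,
# following the glosses given in the comment above.
PRONOUNS = {
    (True, False, False): "mi",
    (False, True, False): "do",
    (True, True, False): "mi'o",   # me and you
    (True, False, True): "mi'a",   # me and others
    (False, True, True): "do'o",   # you and others
    (True, True, True): "ma'a",    # me, you, and others
}


def pick_pronoun(speaker, listener, others):
    """Return the pronoun for a combination, or None if the table has none."""
    return PRONOUNS.get((speaker, listener, others))


me_and_you = pick_pronoun(True, True, False)
everyone = pick_pronoun(True, True, True)
```

Seen this way, "we" in English collapses three separate rows of the table into one word.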
Thanks. I figured grammatical # as such would be discarded as unnecessary but the notion of plurality raises semantic issues that natural language sidesteps by ambiguity.
Disclaimer/Personal Background (trying to be brief): I've often been told I've got an unusual cognitive style (for lack of a better term) and I've often felt very much as if there's an impedance mismatch between how my thoughts are structured and how language operates; in essence, at the word-or-sentence level everything I hear or read is very polyvalent and vague, and only takes on a concrete meaning to me if I get multiple paraphrases of it...it's putting all the variants into superposition and seeing which parts reinforce or cancel that shows me the contour of the actual meaning (which itself is not necessarily ever actually "represented in words" so much as "gets the outline of its semantic boundaries painted").
In the abstract this leaves me with an interest in the idea of something like lojban but very mixed initial reactions: it's possible an artificial language with more-precise meanings would eliminate my need for doing verbal interferometry across multiple paraphrases but on the other hand I have a lifetime's experience feeling very uncomfortable without tons of redundancy and repetition-with-alteration, which seems to be what lojban is trying to eliminate in its use.
Too much info, I'll stop there.
I do have two more questions if you have time.
#1 is historical: what's the process by which the core sets of things like spatial relationships or tenses or shapes or so on came to be enumerated?
EG: if I were doing a language in this form I'd go through all the languages I could get my hands on and try to get good lists of all their fundamental categories (eg: spatial prepositions and "classifiers", like you have in swahili and chinese (+ languages with heavy chinese contact); cf: http://www.jstor.org/pss/413103 ) and then try to factor them into semantic atoms. I'd consider this approach bottom-up (see what's out there, and then try to simplify and unify them) and contrast it with a more top-down approach (trying to derive a finite set of spatial relations ab initio via pure reasoning); it'd also be a good set of "unit tests" for your final set of core concepts, making sure that none of these words' senses is inexpressible in terms of your base concepts.
How did the lojbanists derive their tenses / spatiotemporal prepositions / etc.? Is there a good "history of the design of lojban" that speaks to this?
Question #2: at a practical level how would you decompose "There are dogs in the kitchen" into lojban?
If I had to break it into predicates it'd probably be the conjunction:
- T ~ whatever containment type you have that is ~ "contains within its spatial bounds -- but not structurally -- for an indeterminate time period"
- COUNT(E) > 0
- ENTITY-COLLECTION-TYPE(E,X), where X ~ "collection treated as collection due to spatiotemporal circumstance and descriptive convenience" (EG: E is an entity collection b/c there are label(s) they all share, namely being instance-of dog and contained-in-the-kitchen in the same way; there's no assertion of any other source of entity-identity beyond the circumstances this utterance is describing; contrast to say "baseball team" or "deck of cards", etc., which are entity-collections with a more-persistent and "intentional" identity)
- forall e in E IS-INSTANCE-OF("instance-of-type IoT",e,"dog")
- "instance-of-type" ~ whatever instance-of you have that is ~ "is a concrete instantiation of an abstract type not otherwise specified" (eg: an actual 'dog', not 'Pomeranian')
- + some temporal modifier to explain like "the described circumstance started before I made this utterance and I do not think it has ceased, yet"
...but I'd assume some of the intended distinctions are usually left implicit or inferred; what's a good lojban decomposition?
Something similar was done, but it was explicitly recognised that the purpose of lojban was not to generate "the semantic primes of language." Such an exercise is regarded by some linguists as meaningless, and by others as too difficult. Instead, concepts were listed, and from them a "covering set" was extracted. Similarly tenses, both spatial and temporal.
After the concepts were agreed, each was expressed in each of the (then) six major world languages. The words thus obtained were put through a weighting algorithm to try to find a "word" that had components of each, and that became the lojban word for that concept.
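To give a feel for what such a weighting step could look like, here is a rough Python sketch. The actual procedure is not described in this comment, so everything here is an assumption: scoring by longest common subsequence, the equal weights, the placeholder language labels and the made-up source strings are all purely illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def score(candidate, source_words, weights):
    """Weighted sum of how much of each source word the candidate shares."""
    total = 0.0
    for lang, word in source_words.items():
        total += weights[lang] * lcs_length(candidate, word) / len(word)
    return total


def best_candidate(candidates, source_words, weights):
    """Pick the candidate with the highest weighted overlap."""
    return max(candidates, key=lambda c: score(c, source_words, weights))


# Hypothetical inputs: placeholder labels, made-up strings, equal weights.
source_words = {"lang_a": "doga", "lang_b": "gerk"}
weights = {"lang_a": 0.5, "lang_b": 0.5}
winner = best_candidate(["xyzzy", "gerku"], source_words, weights)
```

A candidate that borrows letters from several source words scores higher than one that resembles none of them, which is the general idea of finding a "word" with components of each.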
Thanks for the response, I do appreciate it. I wasn't aware of how active lojban still is (and it's much easier to get information on thanks to the internet).
I should point out that I'm fairly familiar with the general range of opinion in the linguistics community (as an undergrad I did dual math / linguistics, which made me at that time quite the rara avis, though it's more common now apparently).
Generally I don't give much credence to the idea of semantic primes (at all, not just in some pragmatic sense) but for stuff like spatial relationships + tenses (+ aspect, mood, etc.) it'd seem not an impossible undertaking (do enough reading in linguistic typology and you start seeing enough "repeats" to think such an enumeration might be possible).
After going through a bit of the grammar and the vocab list on wiktionary it seems like you'd have constant problems with synecdoche, which'd bother me (but perhaps only me, and it's not as though natural languages aren't riddled with similar problems).
I've walked away from this with a much stronger sense of the sense in which lojban is attempting to be a logical language, thanks for your time.
It's heartening to see substantial effort put into engineering language; good luck with your efforts.
In lojban, as in Chinese (I believe both Mandarin and Cantonese, although I'm not an expert), number is not explicitly marked and is usually either irrelevant or determined by context. There are mechanisms for specifying number. Ditto tense. Things can be left even more fuzzy, or made more precise.
I think the easiest way to sum up Lojban is this (my understanding based solely on reading ABOUT Lojban, I don't know a single Lojban word... aside from Lojban):
Anything that can be said in any language can be said in Lojban. Anything that can be left unsaid in any language can be left unsaid in Lojban.
Thus while it is not possible to use the verb "to be" in English without expressing a tense, it is in some languages, therefore it is in Lojban. Presumably it is also possible to specify a tense when saying "to be" in Lojban. It can be done in English, therefore it can be done in Lojban.
Consider a reasonably-popular, useful software package. At first, one might just toy around with it without sufficiently reading the documentation. Before long, one will learn to use it correctly, but not before getting a lot of error messages. That does not mean the software is impossible to use.
Lojban is similar, except that the only place you get error messages in Lojban is from other users. What you are seeing is a symptom of how active the Lojban community is, especially since being featured in an xkcd comic. Many newbies are attempting it before learning it. You are seeing a process of welcoming and educating newcomers that has become the norm in our community.
Consider the hypothetical software package again. You may wish to do something weird with it that is at the edges of its scope. You may need to make a feature request, or report flaky performance at a level for which the software is rarely used.
That is what you are seeing as well. Our debugging committee, the BPFK, is hashing out a lot of edge cases. However, these edge cases are tangential to the day-to-day usage of almost anyone, which is bug-free.
As a speaker of lojban, I have to speak out. Lojban is not Ithkuil. Lojban isn't that complicated to speak, and most of the time on the forums, people are debating better ways to express sentences, not just correcting ones that are wrong. Also, part of the lojbanic culture is to nag people if their grammar is incorrect. In English, most of the time, people may cringe or roll their eyes if they see someone saying something incorrectly, but it's considered polite in Lojban to correct someone's grammar. We also have a lot of beginners and new learners, so don't be too hard on them - they're just learning. There's a common misconception that logical = complicated, which is entirely untrue. Lojban is in many places much simpler than English, and makes a lot more sense. I'll admit, there are a couple of things which are more complicated, but lojban was built to be spoken by humans, and you don't have to be a genius to speak it. In short, it's logical - but not complicated!