It's interesting that the Arabic TLDs aren't abbreviated the way Latin and semi-Latin (.рф) country codes are. I wonder if that reflects anything besides the fact that abbreviating words is not as common in written Arabic.
One thing about abbreviations in Arabic is that they are read as a word, not as distinct letters. In English, "http" can't be read as a word; you always read it as "aitch-tee-tee-pee". "etc" is not read "etk" but "ee-tee-see". Whereas in Arabic, "الخ" is read as one word, "ilkh", instead of as the letters "alif-lam-kha".
So, to abbreviate words, you have to think hard about how people will read it. For instance, someone mentioned in another reply "Hamas" and "Fatah". Both these abbreviations read nicely as other words with related meanings. But, how would one abbreviate "Filisteen"?
If you choose the first three letters "Filis", well, that reads like "penny" and is generally used to refer to cheap things - you don't want to use that.
If you use "ft", that would read as "FiTT" فط which just sounds awful in Arabic. (it's not like "fit", the "t" here is heavy/strong).
Actually, "filisteen" is already bad enough: it sounds like "filis" + "TTeen", which is "penny" + "dirt" (we used to make fun of this when we were kids). Though in Palestine (and the Levant region in general) they pronounce it "falasteen", not "filisteen".
It's pronounced "et cetera" because that's what it's an abbreviation of. Most abbreviations (unlike acronyms) aren't verbalized in their abbreviated form. You say "versus", not "vs" when you read it.
However, I do often hear Unix hackers say "et-sey" when referring to the "etc/" directory because precision is important there: the directory is not named "etcetera" so pronouncing it as such would confuse.
It must depend on your definition of "abbreviation". For example, Hamas and Fatah (the main political parties in Palestine) are both acronyms. Fatah is actually a reverse acronym.
We're defending the rights of the oppressed minority against the Jews, and that oppressed minority unites under the party "conquest"? Seriously?
If this is your name, how the hell would you defend the idea that Palestinians are just trying to live independently?
I think that's not strictly true, what about الخ meaning الى اّخره (i.e. to the end of it, meaning "etc".). In some books ح can mean حديث. But generally it is very rare.
Arabic hasn't "modernised" in quite the same way many western languages have, which makes it quite interesting. Since it has stayed a written-only language for quite some time, the letters were not broken up, and its joined-up form has stayed into the digital age (at least, I assume this is why)
I don't mean just abbreviation, I mean Arabic hasn't seen certain types of changes that have occurred in Western languages. Although, arguably, Latin has always been fundamentally a non-cursive script, and Arabic has always been cursive, perhaps.
Ah, I see. I think you may have it backwards, though: script Arabic developed around the same time as lower-case letters in Greek and Latin (very roughly 0CE), coinciding one presumes with growing use of paper for writing. While Latin eventually incorporated its older "upper-case" forms with a minor grammatical function, Arabic discarded them altogether as antiquated. In that sense, one might well make the case that Arabic is in fact the more "modern" alphabet.
You're quite correct. The old 'Latin' script (our uppercase letters) was revived in the Renaissance and slowly merged with the Hunnish script (our lower-case letters), at about the same time as Arabic numerals were adopted (read right to left, for added confusion).
Other alphabets (Greek and Cyrillic) created capital letters in emulation much later (circa the 1800s). Most alphabets/syllabaries (Arabic, Hebrew, Korean, Japanese, etc.) don't have upper and lower cases.
Likewise punctuation, which prosodic languages (like English) need to signal stress and emphasis that can transform meaning, but which is also absent from lots of other writing systems.
I think talk of 'more modern' or 'less modern' is nonsense though. Writing is an imperfect representation of speaking, and whatever works is just fine.
I don't think the timeline for the development of the Arabic script is quite right. I believe it's more typically put at 300-400 CE, with the Nabatean script slowly accruing Arabic-like features over time.
Of course, my knowledge of the history of writing extends to a single undergrad course and a quick skim of Wikipedia as a refresher. I'll happily admit I'm wrong if that turns out to be the case.
I didn't mean to imply modern Arabic script came to be then -- it wasn't what we'd call Arabic today before the Quran -- only that evolution from inscribed block capitals to written script was happening at around the same time as that transition in other languages.
Anyway, in this context I'm counting 300BCE - 300CE as "very roughly 0CE".
Even so, it still seems apparent that modern Latin alphabets have been adapted for typesetting, whereas even contemporary written Arabic seems to remain optimized for handwriting. It seems reading Arabic is a lot more difficult than reading Latin-based writing, language issues aside, simply because the letters are all joined together.
I don't see how cursive script makes Arabic less “modern.” It might make carving it onto stone a bit harder, but the printing press, the typewriter, and the computer all had no problems producing quality Arabic type.
Having written code which renders Arabic text, I found “joining” up the letters to be quite simple. It's just a few rules to choose which glyph to display for each character depending on context. The tricky bit, I found, is integrating a right-to-left script with systems which were made with only left-to-right in mind.
Ever tried dealing with vertical column text (e.g. Japanese)? I haven't myself, but Japanese versions of Windows ship special "@"-prefixed versions of fonts with the glyphs rotated 90 degrees, so that if a document is switched to that font and then rotated, the text is the right way up. That makes me suspect it's pretty difficult.
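To make the "few rules" concrete, here is a minimal Python sketch of choosing a contextual form (isolated, initial, medial or final) for each letter. It is not the code the commenter wrote. The two letter sets are a hand-picked, simplified subset of the Unicode joining classes, the function names are made up for illustration, and it ignores diacritics, tatweel and the lam-alef ligature, which a real renderer has to handle.

```python
# Hypothetical, simplified joining classes (a subset of Unicode's ArabicShaping data).
# Right-joining letters connect only to the letter before them:
# alef, alef variants, dal, thal, reh, zain, waw, waw-hamza, teh marbuta.
RIGHT_JOINING = set(
    "\u0627\u0622\u0623\u0625\u062f\u0630\u0631\u0632\u0648\u0624\u0629"
)
# Dual-joining letters connect on both sides (beh, teh, ..., yeh, alef maksura).
DUAL_JOINING = set(
    "\u0628\u062a\u062b\u062c\u062d\u062e\u0633\u0634\u0635\u0636\u0637\u0638"
    "\u0639\u063a\u0641\u0642\u0643\u0644\u0645\u0646\u0647\u064a\u0626\u0649"
)


def connects_forward(ch):
    """True if the letter can join to the letter that follows it."""
    return ch in DUAL_JOINING


def connects_backward(ch):
    """True if the letter can join to the letter that precedes it."""
    return ch in DUAL_JOINING or ch in RIGHT_JOINING


def contextual_forms(word):
    """Return the form name to use for each character of `word`, in logical order."""
    forms = []
    for i, ch in enumerate(word):
        joined_prev = (
            i > 0 and connects_forward(word[i - 1]) and connects_backward(ch)
        )
        joined_next = (
            i + 1 < len(word)
            and connects_forward(ch)
            and connects_backward(word[i + 1])
        )
        if joined_prev and joined_next:
            forms.append("medial")
        elif joined_prev:
            forms.append("final")
        elif joined_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


# meem, sad, reh: the word for "Egypt"
print(contextual_forms("\u0645\u0635\u0631"))
```

A renderer would then map each (letter, form) pair to a glyph in the font. The right-to-left layout the comment calls the tricky part is a separate step and is not covered here.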
That sounds like the result of a lack of forethought when designing the font rendering code, and not because it's a big problem. I've used vertical Japanese text input extensively in Word without any problems.
Mixing RTL and LTR is a pain in Unix-like tools (GNU, OS X).
What I usually (not always) found during my experiments with Persian is that the GNU tools themselves deal with it quite well, but the terminal application shows the text in pretty random places. If there is both LTR and RTL in one file, havoc usually ensues.
(Persian also has this funny "invisible space", but it causes only small annoyances.)
I digress, but I think this is one of the most annoying things about Unix these days. I practically live by the command line; but terminals are terrible. We're basically emulating technology that was already getting obsolete in the 70s!
Plan 9 had the courage to shed this cruft by having simple text windows which have a prompt. No cursor addressing and no crazy control codes. That makes rendering it just as easy (and beautiful) as rendering text box contents. I wish the Linux desktop environments would follow suit.
My impression is that all the major browsers show these links as punycode for security reasons. What is the point of these then? Is punycode defaulted to "off" in some countries? Wouldn't that just create two classes of users, one more vulnerable to phishing than the other?
It looks like if you add Arabic as a language in Chrome, domains aren't converted. (And actually it will change http://xn--ggblala6cyf.xn--wgbh1c/ to ستفتاء.مصر in the address bar.)
That is not correct. For example, Mozilla takes a whitelist approach, adding TLDs whose registration policies limit the risk of phishing (which is most of them).
Logically, one would think your native language's script would automatically get "whitelisted" and rendered properly, while everything else is punycoded as usual.
Even the TLDs aren't actually in the native script; they are punycoded in the spec. https://en.wikipedia.org/wiki/Internationalized_domain_name#... So the new TLD of فلسطين is really .xn--ygbi2ammx, and both inputs will work. You can just use whichever is easier for you to type.
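For anyone who wants to check the mapping themselves, here is a small sketch using Python's built-in "idna" codec. The helper names to_ascii and to_unicode are made up for this example. The codec implements the older IDNA 2003 rules, so it may disagree with what current browsers do for some labels, though not for the two TLDs mentioned in this thread.

```python
def to_ascii(domain):
    """Convert a Unicode domain (or single label) to its ASCII "xn--" form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Convert an ASCII "xn--" domain (or single label) back to Unicode."""
    return domain.encode("ascii").decode("idna")


# The Egyptian TLD from the Chrome example earlier in the thread.
print(to_unicode("xn--wgbh1c"))

# The Palestinian TLD: both directions of the mapping.
print(to_unicode("xn--ygbi2ammx"))
print(to_ascii("\u0641\u0644\u0633\u0637\u064a\u0646"))
```

The "xn--" prefix only marks a label as encoded. The rest is the Punycode form of the non-ASCII characters, so DNS itself only ever sees the ASCII version.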
The West Bank, East Jerusalem, the Gaza Strip and the Golan Heights are all considered occupied territories (by most of the world at least. The State of Israel calls them "disputed". I'm an Israeli citizen, and I must say I disagree with the official stance of my government.)
"Occupied territories" aren't a country though. Otherwise the domain should be related to Egypt or Jordan (whose territories are supposedly "occupied").
The Gaza Strip, West Bank and East Jerusalem are not self ruled, no.
The term "occupied" relates to the fact that they are governed by the Israeli army, and are not self governed by their native inhabitants (the Israeli settlers unlawfully transferred there do vote for the Israeli parliament, and hence are self-governing in a way, as the Israeli army is under the authority of the Israeli government.)
You are missing some of your history lessons: Jordan and Egypt have rescinded any claims to those territories, instead recognizing the Palestinian people as the ones who should rule those territories (the same was recognized by the State of Israel in the "Declaration of Principles on Interim Self-Government Arrangements", aka the Oslo I Accord).
Jordan and Egypt have rescinded any claims to those territories
That's exactly my point about why those territories aren't occupied. The only way to call them occupied would be if Egypt or Jordan still claimed authority over them. But as you said, they don't. Therefore, while their status isn't clear, they aren't occupied. I.e. they are no more occupied by the "Israeli Civil Authority" than they are by the "Palestinian Authority" (i.e. Fatah and Hamas). At least that's how I view it.
The Oslo agreements weren't supported by Arabs (de facto), so they have been morally void for a long time already. The UN might support the "occupied" terminology, but the UN isn't the only entity that defines it.
Why not? The web is supposed to be open and letting all countries have whatever TLD(s) they want makes sense.
The fact that some browsers may have font problems, etc. makes me think that countries might also want to have a secondary TLD in Latin text. Perhaps they could be mapped to the same domain? I would think that a lot of countries might want this as a local pride issue.
This TLD messes up HN's title: "top-level domain representing the Occupied Palestinian Territory | Hacker News .فلسطين" I assume this is because Arabic is right-to-left, but it's interesting that only the title is affected.
Edit: This affects Mobile Safari on an iPad (iOS 5.1.1).
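One plausible explanation, offered as a guess rather than a diagnosis of that specific Safari behaviour: when no direction is declared, the Unicode bidirectional algorithm takes the paragraph direction from the first character with a strong direction. The sketch below uses Python's standard unicodedata module to show that rule. The function name and the sample string are made up, assuming the page title starts with the Arabic TLD.

```python
import unicodedata


def first_strong_direction(text):
    """Guess the base direction from the first strongly directional character.

    This is a simplified version of the first-strong rule in the Unicode
    bidirectional algorithm (it ignores embeddings and isolates).
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None  # only neutral or weak characters, e.g. digits and punctuation


# Hypothetical title: the leading "." is not a strong character, so the
# Arabic letter after it decides the direction of the whole line.
title = ".\u0641\u0644\u0633\u0637\u064a\u0646 top-level domain | Hacker News"
print(first_strong_direction(title))
print(first_strong_direction("Hacker News"))
```

If the whole title is laid out as a right-to-left paragraph, the Latin words still read left to right internally, but the chunks get ordered from the right edge, which would put the Arabic TLD at the end as in the quoted title.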
Displays correctly in FF 17 / Ubuntu 12.04, but only when Firefox is windowed. When maximised, the Unity bar takes over and I see the same bug you're seeing.
For what it's worth, Spain's Royal Language Academy has called the language "español" since 1923, and 89% of Spaniards reported being fluent in it in 2005. So saying Spaniards don't speak Spanish is like saying Americans don't speak English.
There are five co-official languages recognized in the various regions of Spain: Castellano (everywhere), Aranese (Catalonia), Basque (Basque Country), Catalan (Catalonia), and Galician (Galicia).
In addition, there are several localized languages, that are not "official" but "recognized": Aragonese (Aragon), Asturian (Asturias), and Leonese (Castile and León).
For what it's worth, Galician is essentially a variant of Portuguese, in the sense that it's mutually intelligible with European and Brazilian Portuguese. There are differences in pronunciation, but not significantly more than between European and Brazilian Portuguese.