فلسطين. top-level domain representing the Occupied Palestinian Territory (iana.org)
138 points by matant on Sept 7, 2012 | 69 comments



It's interesting that the Arabic TLDs aren't abbreviated the way Latin and semi-Latin (.рф) country codes are. I wonder if that's reflective of anything besides that abbreviation of words is not as common in written Arabic.


That is correct, there are no abbreviations in the Arabic language.


One thing about abbreviations in Arabic is they will be read as a word, not as distinct letters. In English, "http" can't be read as a word; you always read it as "aitch-tee-tee-pee". "etc" is not read "etk" but read as "ee-tee-see". Whereas in Arabic, "الخ" is read as one word "ilkh" instead of as letters "alif-lam-kha".

So, to abbreviate words, you have to think hard about how people will read it. For instance, someone mentioned in another reply "Hamas" and "Fatah". Both these abbreviations read nicely as other words with related meanings. But, how would one abbreviate "Filisteen"?

If you choose the first three letters "Filis", well, that reads like "penny" and is generally used to refer to cheap things - you don't want to use that.

If you use "ft", that would read as "FiTT" فط which just sounds awful in Arabic. (it's not like "fit", the "t" here is heavy/strong).

Actually, "filisteen" is already bad enough: it sounds like "filis" + "TTeen" which is "penny" + "dirt" (we used to make fun of this when we were kids). Though in Palestinian (and the Levant region in general) they pronounce it "falasteen", not "filisteen".


Good points, although I believe "etc" is pronounced etcetera, or ɛtˈsɛtərə in IPA.


It's pronounced "et cetera" because that's what it's an abbreviation of. Most abbreviations (unlike acronyms) aren't verbalized in their abbreviated form. You say "versus", not "vs" when you read it.

However, I do often hear Unix hackers say "et-sey" when referring to the "etc/" directory because precision is important there: the directory is not named "etcetera" so pronouncing it as such would confuse.


"etc" is not read "etk" but read as "ee-tee-see".

I call /etc/passwd "etsy password".


Of course, the correct pronunciation is "filasteen" :) But I agree. Hard to abbreviate.


It must depend on your definition of "abbreviation". For example, Hamas and Fatah (the main political parties in Palestine) are both acronyms. Fatah is actually a reverse acronym.

http://en.wikipedia.org/wiki/Hamas#Etymology

http://en.wikipedia.org/wiki/Fatah#Etymology


Both words actually have a meaning, "hamas" means "enthusiasm", and "fatah" means "conquest" or "revelation".


Heh. You're kidding right ?

We're defending the rights of the oppressed minority against the Jews, and that oppressed minority unites under a party called "conquest"? Seriously?

If this is your name, how the hell would you defend the idea that Palestinians are just trying to live independently?


It doesn't work like that. "We" are not just the oppressed minority, we are a people with history - you have to appeal to national pride.


Well, an acronym is a very specific type of abbreviation.


I think that's not strictly true, what about الخ meaning الى اّخره (i.e. to the end of it, meaning "etc".). In some books ح can mean حديث. But generally it is very rare.


My understanding is that there are a handful in common use, but the process is not productive as it is in most Latinate languages.


Arabic hasn't "modernised" in quite the same way many western languages have, which makes it quite interesting. Since it has stayed a written-only language for quite some time, the letters were not broken up, and its joined-up form has stayed into the digital age (at least, I assume this is why)


Not sure if "modernized" is appropriate in this context, since productive abbreviation in western languages dates to antiquity, i.e. Latin.


I don't mean just abbreviation, I mean Arabic hasn't seen certain types of changes that have occurred in Western languages. Although, arguably, Latin has always been fundamentally a non-cursive script, and Arabic has always been cursive, perhaps.


Ah, I see. I think you may have it backwards, though: script Arabic developed around the same time as lower-case letters in Greek and Latin (very roughly 0CE), coinciding one presumes with growing use of paper for writing. While Latin eventually incorporated its older "upper-case" forms with a minor grammatical function, Arabic discarded them altogether as antiquated. In that sense, one might well make the case that Arabic is in fact the more "modern" alphabet.


You're quite correct. The old 'Latin' script (our uppercase letters) was revived in the Renaissance and slowly merged with the Hunnish script (our lower case letters), at about the same time as Arabic numerals were adopted (read right to left, for added confusion).

Other alphabets (Greek and Cyrillic) created capital letters in emulation much later (circa 1800s). Most alphabets/syllabaries (Arabic, Hebrew, Korean, Japanese, etc) don't have upper and lower cases.

Likewise punctuation, which prosodic languages (like English) need to signal stress and emphasis that can transform meaning, but which is also absent from lots of other writing systems.

I think talk of 'more modern' or 'less modern' is nonsense though. Writing is an imperfect representation of speaking, and whatever works is just fine.


Hebrew has upper and lower case.


No—there are cursive and block letters, but one is not the lower case version of the other.

http://en.wikipedia.org/wiki/Hebrew_alphabet


I don't think the timeline for the development of the Arabic script is quite right. I believe it's more typically put 3-400 CE, with the Nabatean script slowly accruing Arabic-like features over time.

Of course, my knowledge of the history of writing extends to a single undergrad course and a quick skim of Wikipedia as a refresher. I'll happily admit I'm wrong if that turns out to be the case.


I didn't mean to imply modern Arabic script came to be then -- it wasn't what we'd call Arabic today before the Quran -- only that evolution from inscribed block capitals to written script was happening at around the same time as that transition in other languages.

Anyway, in this context I'm counting 300BCE - 300CE as "very roughly 0CE".


Even so, it still seems apparent that modern Latin alphabets have been adapted for typesetting, whereas even contemporary written Arabic seems to remain optimized for handwriting. It seems reading Arabic is a lot more difficult than reading Latin-based writing, language issues aside, simply because the letters are all joined together.


Aside from abbreviations, what do you mean?


I don't see how cursive script makes Arabic less “modern.” It might make carving it onto stone a bit harder, but the printing press, the typewriter, and the computer all had no problems producing quality Arabic type.

Having written code which renders Arabic text, I found “joining” up the letters to be quite simple. It's just a few rules to choose which glyph to display for each character depending on context. The tricky bit, I found, is integrating a right-to-left script with systems which were made with only left-to-right in mind.
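To make the "few rules" concrete, here is a minimal Python sketch of contextual form selection. It is not the commenter's code: the function names and the small joining table are assumptions for illustration, and it ignores diacritics, tatweel, ligatures such as lam-alef, and letters outside the basic Arabic block.

```python
# Letters that join only to the letter before them (illustrative subset):
# alef variants, waw variants, teh marbuta, dal, thal, reh, zain.
RIGHT_JOINING = set(
    "\u0622\u0623\u0624\u0625\u0627\u0629\u062f\u0630\u0631\u0632\u0648"
)
NON_JOINING = {"\u0621"}  # standalone hamza never joins


def is_arabic_letter(ch: str) -> bool:
    # Basic Arabic letters only; tatweel (U+0640) is deliberately excluded.
    return "\u0621" <= ch <= "\u064a" and ch != "\u0640"


def connects_forward(ch: str) -> bool:
    """True if the letter can join to the letter that follows it."""
    return (
        is_arabic_letter(ch)
        and ch not in RIGHT_JOINING
        and ch not in NON_JOINING
    )


def connects_backward(ch: str) -> bool:
    """True if the letter can join to the letter that precedes it."""
    return is_arabic_letter(ch) and ch not in NON_JOINING


def contextual_forms(word: str) -> list[str]:
    """Name the glyph form each character would take, in logical order."""
    forms = []
    for i, ch in enumerate(word):
        prev_ch = word[i - 1] if i > 0 else ""
        next_ch = word[i + 1] if i + 1 < len(word) else ""
        joins_prev = (
            bool(prev_ch) and connects_backward(ch) and connects_forward(prev_ch)
        )
        joins_next = (
            bool(next_ch) and connects_forward(ch) and connects_backward(next_ch)
        )
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


print(contextual_forms("فلسطين"))
```

A real renderer would map each (letter, form) pair to a glyph, usually through the font's shaping tables rather than a hand-written table like this one.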


Ever tried dealing with vertical column text (e.g. Japanese)? I haven't myself, but Japanese versions of Windows ship special "@"-prefixed versions of fonts with the glyphs rotated 90 degrees, so that a document switched to that font and then rotated ends up the right way up. That makes me suspect it's pretty difficult.


That sounds like the result of a lack of forethought when designing the font rendering code, and not because it's a big problem. I've used vertical Japanese text input extensively in Word without any problems.


Mixing RTL and LTR is a pain in Unix-like tools (GNU, OS X).

What I usually (not always) found during my experiments with Persian is that the GNU tools themselves deal with it quite fine, but the terminal application shows the text at pretty random places. If there is both LTR and RTL in one file, havoc usually ensues.

(Persian also has this funny "invisible space", but it causes only small annoyances.)
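Assuming the "invisible space" here is the zero-width non-joiner (U+200C), which Persian uses inside words, a small Python sketch shows where the annoyance comes from: the character draws nothing but still counts in lengths and comparisons. The example word is only an illustration.

```python
import unicodedata

ZWNJ = "\u200c"  # assumed to be the "invisible space" mentioned above

# A Persian word written with a ZWNJ between its two parts.
word = "می" + ZWNJ + "خواهم"
# The same letters with the ZWNJ stripped out.
stripped = word.replace(ZWNJ, "")

print(unicodedata.name(ZWNJ))      # the character's official Unicode name
print(unicodedata.category(ZWNJ))  # a format character, not a space
print(len(word), len(stripped))    # the lengths differ by one
print(word == stripped)            # naive comparison treats them as different
```

Tools that search or compare text byte by byte will therefore miss matches unless the ZWNJ is normalized on both sides first.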


I digress, but I think this is one of the most annoying things about Unix these days. I practically live by the command line; but terminals are terrible. We're basically emulating technology that was already getting obsolete in the 70s!

Plan 9 had the courage to shed this cruft by having simple text windows which have a prompt. No cursor addressing and no crazy control codes. That makes rendering it just as easy (and beautiful) as rendering text box contents. I wish the Linux desktop environments would follow suit.


My impression is that all the major browsers show these links as punycode for security reasons. What is the point of these then? Is punycode defaulted to "off" in some countries? Wouldn't that just create two classes of users, one more vulnerable to phishing than the other?


It looks like if you add Arabic as a language in Chrome, domains aren't converted. (And actually it will change http://xn--ggblala6cyf.xn--wgbh1c/ to ستفتاء.مصر in the address bar.)


Makes sense. If your language isn't Arabic, you may be unable to, for instance, write the website address down.


I can confirm this. Arabic TLDs and hostnames stay Arabic for me.


That is not correct. For example, Mozilla takes a white list approach where they add TLDs that have registration policies that limit the risk of phishing (which is most of them).

http://www.mozilla.org/projects/security/tld-idn-policy-list...
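A rough Python sketch of what a TLD whitelist policy of this kind could look like. This is not Mozilla's implementation: `display_form` and `trusted_tlds` are hypothetical names, and the whitelist contents are made up for the example. It relies only on Python's built-in `idna` codec.

```python
def display_form(ascii_host: str, trusted_tlds: set[str]) -> str:
    """Return the form of a hostname to show in the address bar.

    ascii_host is the hostname in its ASCII ("xn--...") form.
    trusted_tlds is a hypothetical whitelist of TLDs, also in ASCII form.
    """
    tld = ascii_host.rsplit(".", 1)[-1].lower()
    if tld in trusted_tlds:
        # Trusted registry: decode every label back to Unicode.
        return ascii_host.encode("ascii").decode("idna")
    # Unknown registry: keep the punycode so lookalike names stand out.
    return ascii_host


# Build an example hostname and whitelist without hard-coding punycode.
host = "مثال.مصر".encode("idna").decode("ascii")
egypt_tld = "مصر".encode("idna").decode("ascii")

print(display_form(host, {egypt_tld}))  # shown in Arabic
print(display_form(host, set()))        # shown as punycode
```

The language-based behaviour described for Chrome elsewhere in the thread would replace the whitelist test with a check against the user's configured languages.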


Logically, one would think your native language's script would automatically get "whitelisted" and rendered properly, while everything else is punycoded as usual.


[deleted]


Unicode TLDs.

Obviously only individual languages are whitelisted, as others have pointed out. I wasn't thinking fully.


Even the TLDs aren't actually in the native script, they are punycoded in the spec. https://en.wikipedia.org/wiki/Internationalized_domain_name#... So the new TLD of فلسطين is really .xn--ygbi2ammx and both inputs will work, you can just use whichever is easier for you to type.
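For anyone who wants to check the mapping, a minimal Python sketch using the standard library's `idna` codec (which implements the older IDNA 2003 rules, so other names may differ from what current registries accept):

```python
tld = "فلسطين"

# Unicode label -> ASCII-compatible form with the "xn--" prefix.
ascii_form = tld.encode("idna").decode("ascii")
print(ascii_form)  # the thread gives this as xn--ygbi2ammx

# And back again: ASCII form -> Unicode label.
round_trip = ascii_form.encode("ascii").decode("idna")
print(round_trip == tld)
```

Either spelling resolves to the same name because only the ASCII form is ever sent in DNS queries.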


Cool beans. Glad to see IANA recognizing it. Hopefully the US Gov won't pull all their funding. </political joke>


A bigger question is if the two factions (Fatah and Hamas) will fight over control of this TLD.


Lately they have been getting along pretty well and have reconciled, but this has made Israel really angry.


If the domain is just for the occupied territories, then Gaza (and Hamas) will get a different domain.


Why?

The West Bank, East Jerusalem, the Gaza Strip and the Golan Heights are all considered occupied territories (by most of the world at least. The State of Israel calls them "Disputed". I'm an Israeli citizen, and I must say I disagree with the official stance of my government.)


"Occupied territories" aren't a country though. Otherwise the domain should be related to Egypt or Jordan (whose territories are supposedly "occupied").


The Gaza Strip, West Bank and East Jerusalem are not self ruled, no.

The term "occupied" relates to the fact that they are governed by the Israeli army, and are not self governed by their native inhabitants (the Israeli settlers unlawfully transferred there do vote for the Israeli parliament, and hence are self-governing in a way, as the Israeli army is under the authority of the Israeli government.)

You are missing some of your history lessons: Jordan and Egypt have rescinded any claims to those territories, instead recognizing the Palestinian people as the ones who should rule those territories (the same was recognized by the State of Israel in the "Declaration of Principles on Interim Self-Government Arrangements" aka Oslo I Accord).


Jordan and Egypt have rescinded any claims to those territories

That's exactly my point about why those territories aren't occupied. The only way to call them occupied would be if Egypt or Jordan still claimed authority over them. But as you said, they don't. Therefore, while their status isn't clear, they aren't occupied. I.e. they are no more occupied by the "Israeli Civil Authority" than they are by the "Palestinian Authority" (i.e. Fatah and Hamas). At least that's how I view it.

Oslo agreements weren't supported by Arabs (de facto), therefore they are morally void for a long time already. UN might support the idea of "occupied" terminology, but UN isn't the only entity who defines it.


is it a weapon of mass destruction?


Why not? The web is supposed to be open and letting all countries have whatever TLD(s) they want makes sense.

The fact that some browsers may have font problems, etc. makes me think that countries might want to also have a secondary TLD in Latin script. Perhaps they could be mapped to the same domain? I would think that a lot of countries might want this as a local pride issue.


This TLD messes up HN's title: "top-level domain representing the Occupied Palestinian Territory | Hacker News .فلسطين" I assume this is because Arabic is right-to-left, but it's interesting that only the title is affected.

Edit: This affects Mobile Safari on an iPad (iOS 5.1.1).
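One plausible mechanism, sketched in Python: when no direction is declared, the Unicode bidirectional algorithm takes the paragraph direction from the first strongly directional character, and this page's title starts with an Arabic letter. The function below is a simplified version of that rule (it ignores isolates and embeddings) and is only a guess at what the title bar is doing, not a diagnosis of the Safari behaviour.

```python
import unicodedata


def first_strong_direction(text: str) -> str:
    """Guess paragraph direction from the first strong character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return "ltr"  # no strong characters: fall back to left-to-right


title = "فلسطين. top-level domain representing the Occupied Palestinian Territory | Hacker News"
print(first_strong_direction(title))
print(first_strong_direction("Hacker News"))
```

If the whole title is laid out right-to-left, the Latin run ends up on the left and the Arabic word, with its dot, on the right, which matches the order reported above.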


It appears to be a bug in Safari in general. I see it on 6.0.


It displays correctly in the Firefox content and UI (17.0a2, yes I am behind), but incorrectly in the OSX title bar.


Displays correctly in FF 17 / Ubuntu 12.04, but only when Firefox is windowed. When maximised, the Unity bar takes over and I see the same bug you're seeing.


Works fine for me. (Firefox 18.0a1 Windows XP)


Will this become a trend, with other countries starting to register TLDs in their native language?

Even if they use Latin script, the name of the country is not always written the same in its own language vs in English.


The creation of non-Latin-script TLDs has been going on for about 3 years now - see this list:

https://en.wikipedia.org/wiki/List_of_Internet_top-level_dom...


This is already happening. The TLD for Spain is .es


Also Germany (.de), Switzerland (.ch), Croatia (.hr), and several more. It's quite common. http://en.wikipedia.org/wiki/List_of_Internet_top-level_doma...


I don't think it's "es" for "español" (Spaniards don't speak Spanish), it's "es" for "España".


For what it's worth, Spain's Royal Language Academy has called the language "español" since 1923, and 89% of Spaniards reported being fluent in it in 2005. So saying Spaniards don't speak Spanish is like saying Americans don't speak English.


Yes, I knew it was "España" but I was too lazy to get the special character.

But, um, what do the people in Spain speak if not Spanish?


There are five co-official languages recognized in the various regions of Spain: Castellano (everywhere), Aranese (Catalonia), Basque (Basque Country), Catalan (Catalonia), and Galician (Galicia).

In addition, there are several localized languages, that are not "official" but "recognized": Aragonese (Aragon), Asturian (Asturias), and Leonese (Castile and León).

And there are countless dialects of each.


For what it's worth, Galician is essentially a variant of Portuguese, in the sense that it's mutually intelligible with European and Brazilian Portuguese. There are differences in pronunciation but not significantly more than between European and Brazilian Portuguese.


Castellano or Catalan, perhaps?


And also Basque (Euskara), Galego (Galician) and maybe a couple of other languages with a quite representative number of speakers.


They speak español.


That's just ISO country codes.

IE is IrEland because IR is IRan, and Iran comes before Ireland alphabetically.


I think you mean in their native character set, rather than native language


FYI, this article is 2 years old.


"Presidential deGrees"?



