Personal names around the world (w3.org)
261 points by gulbrandr on July 8, 2014 | 114 comments



Another sorting oddity: you can't correctly sort proper names in Danish without knowing if the name is a Scandinavian or foreign name, e.g. via some big lookup table (or maybe heuristics). If it's a Scandinavian name, 'aa' is an alternate rendition of 'å' (retained mainly in proper names), and is sorted after 'z'. But if it's a foreign name, 'aa' is just two 'a' in a row, and is sorted at the front. Therefore the city of Aalborg goes towards the end of an encyclopedia, but the city of Aachen goes towards the front. Also the case with personal names, e.g. Kierkegaard (a Danish surname) is sorted as if it were Kierkegård, but Haas (a German and Dutch surname) is sorted as H-a-a-s, not H-å-s.

On the other hand, people are getting more used to just sorting 'aa' as 'a-a' rather than 'å', because many computer systems do so. Encyclopedias still use the traditional sort order, as do hand-ordered lists of people, but if you get a printout of students registered in a course from the registrar, it's pretty likely it'll be sorted the way an English speaker would expect. Either that, or the reverse: the computer might use a collation method where all 'aa' pairs are treated as an 'å', which isn't quite right either, and maybe actually more confusing.
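
A minimal Python sketch of the traditional rule, for illustration only. The `FOREIGN_NAMES` set is a hypothetical stand-in for the big lookup table mentioned above, and the helper name `danish_sort_key` is made up for this example; real collation libraries handle far more cases.

```python
# Danish alphabet order: a-z, then æ, ø, å at the very end.
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET)}

# Hypothetical lookup table: names in which 'aa' is just two a's in a row.
FOREIGN_NAMES = {"aachen", "haas"}


def danish_sort_key(name):
    """Return a sort key that treats 'aa' as 'å' unless the name is foreign."""
    folded = name.lower()
    if folded not in FOREIGN_NAMES:
        folded = folded.replace("aa", "å")
    # Characters outside the alphabet (spaces, hyphens) rank after all letters.
    return [RANK.get(ch, len(DANISH_ALPHABET)) for ch in folded]


names = ["Aalborg", "Aachen", "Zimmermann", "Haas", "Hansen",
         "Kierkegaard", "Kierkegård"]
sorted_names = sorted(names, key=danish_sort_key)
print(sorted_names)
```

With this key, Aachen lands at the front, Aalborg after every 'z' entry, and Kierkegaard and Kierkegård get the same key, so they end up next to each other.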


Does that mean that you can use Kierkegård and Kierkegaard interchangeably? Does that cause confusion when you are trying to locate personal records?


To elaborate on what taejo said: in Danish, personal names do not change spelling, even if they mean the same. It is not uncommon to see people's given names or surnames being small variations of similar names. But in these terms, you should think of »Kierkegård« and »Kierkegaard« in the same manner as 'Philip' or « Philippe ». They are the same name, pronounced the same way and mean the same, but spelt differently.

Of course, some do make mistakes and spell it in a more common way. My surname, for instance, is on occasion spelt »Utzon« because of Jørn Utzon (who designed the Sydney Opera House), whose surname sounds like mine.

But the problem in Danish isn't just limited to personal names and proper nouns. In English, words like 'subtle', 'segue' and 'colonel' are not pronounced as you'd think (from their spelling), because they are loanwords, and you just need to know this. This problem is more unfortunate in Danish, as it is far more common, even amongst Germanic words.

It can be hard to explain to foreigners why the o's in »mormor« are pronounced differently ([ɒ] and [o], respectively).


I don't know about Danish, but in German, ü "decomposes" into "ue" and is sorted that way in phonebooks. Herr Müller wouldn't be surprised to see his name spelled Mueller in ASCII-only contexts. OTOH, in some families it is traditional to write the name without the umlaut, and in these cases it's wrong to write it with, so Frau Mueller wouldn't like to see her surname written Müller (though it has the same pronunciation and origin). Famous examples are Goebbels and Goethe (not Göbbels and Göthe). There's a similar situation with ß vs. ss.

So if Danish is like German, you can't call Kierkegaard Kierkegård, but they're still the "same" surname and appear next to each other in the phone book.
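
A rough Python sketch of that phonebook-style ordering, assuming the usual expansions ä→ae, ö→oe, ü→ue and ß→ss. The function name `phonebook_key` is hypothetical. The key is used only for ordering; the spelling the person uses is left untouched.

```python
# Expansion table used only to build the sort key, never for display.
PHONEBOOK_EXPANSIONS = str.maketrans(
    {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}
)


def phonebook_key(name):
    """Lower-case the name and expand umlauts and ß for sorting purposes."""
    return name.lower().translate(PHONEBOOK_EXPANSIONS)


surnames = ["Muller", "Müller", "Mueller", "Mufti"]
ordered = sorted(surnames, key=phonebook_key)
print(ordered)
```

Müller and Mueller get the same key, so they sit side by side, while each entry keeps the spelling its owner prefers.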


That's generally the same in Danish, although one historical tidbit to add to it is that the aa->å respelling in Danish happened only in 1948. With that spelling reform, all normal words using 'aa' as a digraph were respelled to use 'å', so for example "cemetery" went from "kirkegaard" to "kirkegård". The latter is now the standard spelling, and "kirkegaard" would only be accepted in situations where it's a technical necessity (e.g. using 7-bit ASCII).

If that had been done across the board, Danish would be easy to sort, since the only digraph would have been eliminated, replaced with an atomic character. However the reform was not made mandatory for personal names; people could choose to retain the 'aa' or switch to 'å', and many retained the traditional spelling (the traditional spelling was also retained for historical figures). There was some discussion about officially reforming the names of cities to use the new 'å' spellings, but both Aarhus and Aalborg (the 2nd- and 3rd-largest cities) strongly objected. So placenames were not reformed (unless a specific locality opted in). You do occasionally see the spellings Århus and Ålborg, especially in writing closer to the time of the reform, but those spellings didn't catch on.


Actually, Århus changed its name immediately after the reform, but Aalborg did not. It was not until 2012 that Århus decided to change its name back to »Aarhus« to become 'easier to spell internationally' (which was the argument).

I still spell it Århus to taunt the idea that it was the »å« that prevented Århus from the same business opportunities as Copenhagen -- I'm sorry, København.


On a related note: I remember experimenting with sorting of names and discovered that Windows, for instance, has locale-specific sorting, including one for Danish in which »aa« is sorted as one letter aligned with »å«. Of course, in this case, it was ignorant of whether the word was actually Danish in origin, or me trying to give my files sortable names (yes, a Danish installation of Windows will use this sorting in directory views).


To sum it up:

TL;DR - Names are complex, simply ask the person for a Unicode representation of how they'd like to be called and use exactly that.

1. Any reasonable assumptions or generalizations that you could try to make about names are wrong for millions of people, so don't make any.

2. There is no reasonable way to automatically split a person's name into meaningful parts, because there is no such thing as a single universal ontology of meaningful name parts that will fit everyone. Treat personal names as indivisible, immutable items; don't try to separate them into parts.

3. There is no reasonable way to automatically generate a short-polite-addressable form from a full name, aside from (a) asking the user or (b) limiting yourself to a whitelisted subset of names that won't work for millions of USA citizens, much less globally. So don't even try.

4. Don't assume that you can infer [non]existence of family relations from their names. You can't.


> Treat personal names as indivisible, immutable items; don't try to separate them in parts.

Unfortunately, that's (often) not an option; as soon as you have to interact with external entities such as banks (but not exclusively them), you're going to be asked for the first and last name of the user.

EDIT: and, by the way, I think that's horrible. Mangling someone's identity like that is pretty bad.


Don't treat those as personal names, then. Expose the bank's interface to your user. If the bank is asking for "first and last name", then say, "the bank is asking for your first and last name; what do you wish us to tell them?"


The customer doesn't really care if it's going into a bank, CC processing (my case), a hotel registration system (also my case), or a soulless SAP installation somewhere in the guts of your company (yes, this is also my case). They're also used to dealing with this (in my opinion unfortunately), but usually not that interested in reading excuses.

FWIW, we only ask for first/last name where relevant (i.e. billing information), but splitting hairs and asking for both is more bother for the user than it's worth.


> but splitting hairs and asking for both is more bother for the user than it's worth.

Then don't. Be honest and don't ask for personal names at all. Ask for billing information.

The "name" fields in billing forms are not personal names. They're keys by which the billing system can correctly identify the account to access. Don't try to pretend that you know someone's name just because you have their billing information. That's lying.

Especially in cases as simple as a customer who is using someone else's credit card to pay for your service. That name is definitely wrong even if one is John Smith and the other is Jane Doe.


I can only agree. It's absolutely horrendous UI to ask for anything other than "what is your name?". It's incredibly unfortunate that various systems that one has to integrate with all have their own inadequate representation of names (given/last, given/middle/last, etc.) when everybody would probably be happier with a single Unicode string "name". :(


I agree, a "how should we address you" field is ideal. I'd like to add though that in case you might have to call them at some point, maybe ask them for a transliteration too where applicable (e.g. Japanese).


Yes, that's the correct solution for asking for a name. As an example of this, as a side project I'm building an app for my wife (or other teachers) to collect assignments from her high school students and mark their work. In this case, there are separate fields to ask teachers "What name do other teachers call you?" and "What name do students call you?". Both are single, free form text fields that take any UTF-8 text of a reasonable length.

This is especially important for the school environment because the same teacher might be "Joe Schmoe" to other teachers, and "Mr. Schmoe" to students. I also know of some women who married and changed their name, but still use their maiden name with students. Other teachers are casual with their students and have them use their first name. If they teach a Judo class at the school, they might want to be called something like "Sensei Joe".

At any rate, each use case for a person's name should be asked separately, and used verbatim -- without trying to break them up at all.
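
As a hedged illustration of that idea (not the actual app's code), here is a small Python sketch. The class name `TeacherNames`, the field names, and the 200-character limit are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical cap standing in for "a reasonable length".
MAX_NAME_LENGTH = 200


@dataclass(frozen=True)
class TeacherNames:
    """One free-form Unicode string per use case, stored and shown verbatim."""
    name_for_teachers: str
    name_for_students: str

    def __post_init__(self):
        for value in (self.name_for_teachers, self.name_for_students):
            if not value.strip():
                raise ValueError("name must not be empty")
            if len(value) > MAX_NAME_LENGTH:
                raise ValueError("name is too long")


def display_name(names, audience):
    """Pick the stored string for the audience; never split or reformat it."""
    if audience == "student":
        return names.name_for_students
    return names.name_for_teachers


joe = TeacherNames(name_for_teachers="Joe Schmoe", name_for_students="Sensei Joe")
print(display_name(joe, "student"))
```

The only validation is "not empty" and "not absurdly long"; nothing inspects the content or tries to derive one form from the other.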


For some applications a single "name" is good enough, but in other cases, even if your application doesn't interpret it, some downstream consumer of that information (either a person or a computer) will interpret it, like it or not, and may well be more likely to screw it up if there's no additional information (family/given distinction etc).

I think there probably isn't any universal solution that will minimize screwups in all cases, and you do need to look at your usage and audience... ><


> Names are complex, simply ask the person for a Unicode representation of how they'd like to be called and use exactly that.

Right, sometimes the best way to model a complex thing is to simply provide a free-text field called "description".


One more interesting implication of 2: you might want to let the user specify what to use for sorting. Although that seems problematic to me.


Internationalized sorting is a whole new can of worms, and there are mutually exclusive rules in different languages.

http://www.unicode.org/reports/tr10/


And (for names, at least) in the same language in different contexts. See https://en.wikipedia.org/wiki/Mac_and_Mc_together (poor article, but the gist is there)

I believe the same is true of de~ and van~ surnames, which are sometimes ordered by the first capital (i.e. van Houten comes before de Koninck), and sometimes not.

This means that offering a user the option of presenting a sort version of their name still leaves them in the dark as to what comes before or after it.
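
To make the two conventions concrete, here is a small Python sketch. Both key functions are hypothetical helpers written for this example, and "Jansen" is just an extra sample surname; neither ordering is claimed to be the official rule anywhere.

```python
def key_by_first_capital(surname):
    """Skip leading lower-case particles such as 'van' or 'de'."""
    words = surname.split()
    for i, word in enumerate(words):
        if word[:1].isupper():
            return " ".join(words[i:]).casefold()
    return surname.casefold()


def key_literal(surname):
    """Sort on the surname exactly as written, particles included."""
    return surname.casefold()


surnames = ["van Houten", "de Koninck", "Jansen"]
by_capital = sorted(surnames, key=key_by_first_capital)
by_literal = sorted(surnames, key=key_literal)
print(by_capital)
print(by_literal)
```

The same three surnames come out in two different orders, which is exactly why a user-supplied sort string cannot tell them where they will end up in someone else's list.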


Which is why, I guess, going on 4 years now Mongo has had an open issue to implement Unicode sorting and still has not.


No, this is overly simplified and W3C recommends nothing of the sort.

The article helps you analyze your project's requirements (what data do you need to capture? what is your audience? how much resources can be spent on nice UX? etc.) and come up with a solution appropriate for these particular requirements.


I was about to write the same thing. I'll add that the takeaway is basically: keep doing what you've been doing and handle outliers individually, or let people figure it out themselves. It has worked pretty well thus far; it will keep working well in the future.


It did not work pretty well so far. It works horribly for most of the people on this planet.


Fascinating article.

Back in the early 1980s, my sleepaway camp went to Boston, where we visited the Children's Museum. It had a computer with which you could interact.

Ever the aspiring nerd, I went over to the computer. It asked me to enter my name. I did so.

It answered: "Reuven isn't a real name. Please enter a real name."

(I ended up entering "David," my counselor's name, just to get through that first screen.)

Be very careful when you vet names for format, spelling, or anything else. As this article points out, there is a lot of variety among names in the world.

However, even well-meaning people can make mistakes, and it's useful to be forgiving. I have been teaching programming courses for years, and in one of my examples, I create a Person class. However, when I got to China, I realized that all of my slides talked about a "first name" and a "last name," which was both confusing and backwards from what my students expected. I'm working on updating these slides, so that they will be appropriate for all of my audiences. But as this article points out, it's a struggle to do so.


Unbelievable, especially in the days of few-hundred-kb memories, that someone had that idea and thought it was good.


They were probably concerned about people entering offensive words or phrases that would presumably later be spit back by the computer, so they went with a whitelist.


Well, if you restrict people to a list of 65536 names, you can just use the index into the list to save space on representation ;)


Some countries have a list of "approved" names, and it's very hard to register a name not in it, for example in Argentina, every province has a list of approved names, and you have to write a letter with the etymology of the name for it to get approved:

http://www.planetamama.com.ar/nota/ley-nombres-como-registra...

Law:

http://www.sdh.gba.gov.ar/comunicacion/normativanacyprov/pue...


How do these countries handle immigrants from countries where there are no such regulations?


Usually, I think, by the simple expedient of allowing them to keep their names. However, ISTR that in Thailand there is a requirement for anyone who becomes a Thai citizen to adopt a new, unique Thai-style surname (only people who are actually related may share a surname).


I am a naturalized US citizen who was born in the former USSR republic, which then became the independent country Ukraine. (PSA: it's pronounced ukrAIne, not UKraine, and definitely not "the UKraine"). My original name was Igor Andreevich Partola, where Andreevich is my father's name with a -vich ending. My birth certificate was issued in the USSR with this name, but since the USSR was no more, and Ukraine started using Ukrainian as the official language, before coming to the US, my middle name on my birth certificate translation got changed to Andriyovich (Ukrainian version). Thankfully my first name did not have to change.

When I got to the US, I stopped using my middle name altogether, both to avoid having to spell it on the phone, etc. and to minimize this type of confusion. In most cases I had no issues, but the immigration process was stumped by this. In general, immigration officers are not really prepared to deal with foreign documents and foreign names. You'd think by now they would have seen it all, but seeing a birth certificate from the USSR and a translation of it in English was something that stumped most of them. Going from being here on a visa to getting a green card is a grueling process because of issues like this. By contrast, going from green card to citizen is a cake walk since they can simply refer to your US-issued green card as a form of ID.

P.S.: I am lucky that my last name does not change between genders. For example Ivanov and Ivanova is the same last name but one is masculine one is feminine. I have heard from friends about having issues with this too.


> my middle name on my birth certificate translation got changed to Andriyovich

To anyone else following along (since I'm sure a Ukrainian understands how Ukrainian names work): this is not a middle name, but a patronymic, similar to Icelandic "surnames" which are really all patronymics too. Russians and Ukrainians and other Slavs use "givenName patronymic familyName", where family names are a relatively recent invention (so the traditional formal form of address is "firstname patronymic" instead of "Mr familyName").

In Lord of the Rings, these patronymics are played with in names such as Aragorn son of Arathorn and Frodo son of Drogo.

> PSA: it's pronounced ukrAIne, not UKraine, and definitely not "the UKraine"

Why are Ukrainians so insistent on not being called "the Ukraine"? I honestly do not understand. There are no articles in Slavic languages except Bulgarian, so why do they care so much if an article is used in English?


Because "the Ukraine" makes it sound like a territory, like "the South West". Ukraine is an independent country. You don't say "the England" or "the Canada". So please stop using the grammatically incorrect "the Ukraine" in favor of the correct "Ukraine".


Ah, I see. It was a territory and no longer is. And some Ukrainians don't like the suggestion that they're somehow part of Russia, like the etymology of their name might suggest.

We do say "the Netherlands" or "the USA", though, so it's not grammatically incorrect to use "the" for some country names, it's just a matter of politics.


I am not actually sure why it's "the Netherlands". I suspect it's because "Netherlands" means "low countries", and countries is a common noun. In that light it could be similar to "The United States" or "The United Kingdom". Here you are referencing common nouns "state" and "kingdom", but proper nouns should not include an article such as "the". "The" implies "specific instance of" vs "a" which implies "any one of many". There are many "kingdoms", but we are talking about "United Kingdom", a very specific kingdom.

Long story short, I don't think it's just political, but based on specific rules of English grammar. Further reading:

http://time.com/12597/the-ukraine-or-ukraine/

http://en.wikipedia.org/wiki/Name_of_Ukraine


Well, "Ukraine" probably means "the region" or "the border" or something like it, so the name sounds like something generic too that would need an article to disambiguate.

It really is politics, you can't be objectively correct here. It's just like me trying to convince you in accordance with my Mexican upbringing that America is not a country. I will not succeed, because the distinction is political. Since human languages are also consensus-based, grammar is in a sense also political (e.g. sociolinguistics).


By that logic we should all be saying "the England" because England is a "land". No, there is one objectively correct way of saying the name of a country (in English), and there are even explicit grammar rules for this specific instance. See the two pages I linked above.


Kind of related: in German we mostly say "die Ukraine" ("the Ukraine"), exactly the same as "the Netherlands" and, surprisingly, Slovakia (and the former Czechoslovakia, but not always "the Czech Republic": only when we say "die Tschechische Republik", not "Tschechien", which is informal, to be fair, as is "Slowakei", but afaik "Tschechoslowakei" was official). Did I confuse you already? There's also "die Elfenbeinküste" (Ivory Coast), and certainly a few more.


> "die Elfenbeinküste" (Ivory Coast)

My German is pretty bad but it looks to me like "elephant bone coast" in German.


Ivory is elephant tusk, and people in the past didn't always distinguish between tusks, bones, and teeth.


Looks like Google Translate disagrees with you: https://translate.google.com/#auto/en/Elfen%20%0Abein%20%0Ak...

Elven leg coast. I guess this is why I've been told off by German speakers in the past for deconstructing words like that.


Yes, you shouldn't do this. ;) In this case "Bein" (or in the long form "Gebein") is an outdated term for bone or skeleton.

"Elfenbein" is the modern word for ivory. The meaning of Elf is somewhat similar (though not identical) to English and therefore refers to the perceived beauty of ivory.

A tusk on the other hand is called "Stoßzahn" recognizing it as a tooth ("Zahn").


My German is not so great but bein means both leg and bone. I know that much. I just wasn't sure about the "elfen" part. So glad to see someone else provide some info on that bit.

FYI: Google translate is not the most reliable translator.


For the US, the full name is "The United States of America". In this name "State" isn't a common noun, it has a specific legal meaning identifying one of the 13 originally independent countries that joined together to form The United States of America. I'm not sure why they were never called countries, but that's essentially what they were, and to a limited extent they still function as small(ish) countries.

This is the same as saying "The European Union" rather than "European Union".


This article [1] attempts to clarify, while actually muddying the waters considerably from my point of view.

In addition to "the United States", "the United Kingdom", they list "the Congo", "the Bahamas", "the Netherlands", "the Philippines", "the Sudan" -- the article also lists others, like Gambia, Yemen, and Lebanon, where I personally am not as familiar with the use (although I've heard "the Yemen" before, probably from that movie about salmon fishing).

[1] http://www.bbc.com/news/magazine-18233844


You do say "The UK," "The USSR," and "The United States" though. I wonder if, because of this, it's a habit among English speakers to put the article in front of countries whose names start with that U sound (juː according to Wikipedia).


This native English speaker would never think to say "The Uzbekistan" or "The Uganda" though. All of your examples are country names that are compound names, and hardly natural-sounding names at all.


Nothing like that at all - it's quite simple: a name that is a collection of things can be 'The [collection of]'.

  - The Netherlands = The [collection of] lands that are 'nether' (=associated with reclamation from the sea, presumably)
  - The USA = The [collection of] states
  - The UK = The [collection of] kingdoms
  - The USSR = The [collection of] soviet republics

The country called Ukraine is not a collection of ukraines, so it shouldn't be The'd. :)


Habit? I've yet to hear anyone say "The Uganda", "The Uzbekistan" or "The Uruguay". I think having United precede is what makes the difference.


In those cases, the elements of the full name of the country in question are each common nouns rather than proper nouns. Thus, "The Union of Soviet Socialist Republics" is more useful/appropriate than "Union of Soviet Socialist Republics" because without the "The", it's not clear that the name describes a unique entity rather than a generic union of soviets.


I am pretty disappointed with our name handling also. I was helping an Iraqi move to the US, and someone translated his name as Hogar instead of Hoger on one document, and it cost him a few months. As far as he was concerned his name doesn't have any English letters in it, so Hogar and Hoger are just about the same.


To me the best solution really would be to issue documents with two fields: your name as an unbreakable set of characters in your mother tongue, and pronunciation of your name in Latin1. That way I could be:

Игорь Партола / Igor Partola

The trouble with this is when you have to spell your name over the phone. Nobody would figure that out. One day that won't be a problem. That'll probably be the same day that all manner of government computer systems stop using all caps.


Hah, anecdote time.

Lived in Israel for a year, needed to fill forms (visa, bank account etc). My name is Benjamin, but the pronunciation is different from English (ja is different, in English it turns to a weird dj as in djungle). Speaking English in Israel led to the transcription of the _English_ pronunciation (they have a special way to encode just that sound as 'ג). So, I have a name that IS more or less Hebrew, and exists in a mostly equivalent way in IL, but it was written in a wrong way.

Anecdote 2: Since this dj sound doesn't really exist in Hebrew and needs to be encoded as (in Latin letters) g', some programs don't understand that. Copy/pasting one Hebrew mail I got into Google Translate led to my name being translated as bang sex (the name in Latin letters would kinda look like bng'min, the ' was considered as a delimiter and so bng became bang, and the rest was taken as a separate word). I was confused, my coworkers amused.

My last name was written differently on my cc, my healthcare documents and my apartment contract. Weird times.


Similar for Polish, basically all the tails got dropped when data was moved from one form onto the next, no trouble really except for Ł and ł. The second would often be entered as k instead of l. Lots of trouble for many people later cause of that.


OT and nitpicking but the part about Icelandic names is actually slightly incorrect. The typical Icelandic "surname" is made up of the genitive case of the father's name (in rare cases the mother's) followed by "son" or "dóttir", not "sson" or "sdóttir". It just so happens that the genitive case of most names ends with an "s", e.g. Gunnar->Gunnars or Guðmundur->Guðmunds. There are however many names where this is not the case. My name for instance, "Örn", is "Arnar" in genitive case and thus if I had a son his surname would be "Arnarson" and my daughter's surname would be "Arnardóttir".
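
A tiny Python sketch of the rule, for illustration. The genitive forms cannot be derived by simply appending "s", so the sketch uses a lookup table; the `GENITIVE` table holds only the three examples above and the function name `patronymic` is hypothetical.

```python
# Genitive forms must be looked up; only the examples from this comment are listed.
GENITIVE = {
    "Gunnar": "Gunnars",
    "Guðmundur": "Guðmunds",
    "Örn": "Arnar",
}


def patronymic(parent_given_name, child_is_daughter):
    """Genitive of the parent's given name plus 'son' or 'dóttir'."""
    genitive = GENITIVE[parent_given_name]  # KeyError if the name is not in the table
    return genitive + ("dóttir" if child_is_daughter else "son")


print(patronymic("Örn", child_is_daughter=False))
print(patronymic("Gunnar", child_is_daughter=True))
```

The double "s" in Gunnarsson comes from the genitive "Gunnars" plus "son", not from a fixed "sson" suffix.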


what do you do if the parent has a foreign name?


I am far from being an expert but I think the most common thing to do is to just take on the surname of the parent as a sort of family name. So if Björk's father was John Smith then she would be Björk Smith. Family names are not unheard of and there are even a few Icelandic ones (although they are definitely the exception). Name laws in Iceland are really strict[0] but they mostly concern people's given names. I think you actually get to choose the surname of your child. I have seen cases where people have gone with the "-son" or "-dóttir" formula, typically in cases where the foreign name has an Icelandic counterpart, e.g. "John" and the Icelandic "Jón". The surname of the child could then be either "Johnson" or "Jónsson".

[0]: http://www.theguardian.com/world/2014/jun/26/iceland-strict-...



I too thought of this post while reading the article. Great insight on both counts, but I appreciate how the W3C article gives some strategy for how to account for most cases that you will run into. Trying to turn Patrick's list into a practical design is quite an undertaking.



...and it gets my gender wrong (USian bias, I suppose?). A nice idea, but not very reliable.


Which makes the mistake of defining a "first" and "last" name. What's a "first" name?


That got my name wrong. Damn.


At least once every 2 years I have to deal with trying to explain why the following set of requirements is a bit of an issue:

1. We want separate first name and last name fields, but no additional fields.

2. We want to be able to sort by last name.

3. We want to go international.

The worst part is: this happens in the Netherlands, where "van" and "de" are common and people should really know better. It is so hard to explain why beyond a simple "full name" field naming conventions are tricky to design and code for.


My legal name is "William Godfrey". (No middle name.)

My stage name is "Bill P. Godfrey". (Because Google have too many called Bill Godfrey ahead of me.)

In an informal setting I prefer to be called "Bill", 4 characters which appear nowhere in my legal name.

So yes, please have a "What should we call you" field on your forms.


And also, if you ask "what should we call you," DO NOT use that as any sort of unique identifier. I just registered on a site that did not let me use my given name as "what should we call you" because it was taken. The stupidity.

If you need something unique, generate it, and keep it to yourself. I don't care a whit about your database.
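
Something like this Python sketch is what I have in mind. The `Account` class and its field names are made up for illustration; the point is only that the identifier is generated internally and the display name is never checked for uniqueness.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Account:
    """The display name is free-form and may collide with other accounts."""
    display_name: str
    # Generated internally and never shown to the user or typed in by them.
    account_id: str = field(default_factory=lambda: uuid.uuid4().hex)


first = Account(display_name="Bill")
second = Account(display_name="Bill")
print(first.account_id != second.account_id)
```

Two people can both be "Bill" and the database still has a distinct key for each of them.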


It sounds like the form you filled out just poorly labelled their "Pick a Username" field. The concept of a website generating something to call you is interesting though. It could lead to hilarious scenarios.


No, the login is your email. Or is there a login and a username? Stupid again.

And no, I don't mean generating something to call me. I'm assuming they need a unique digital identifier, and they're just falling back on "name," whatever "name" means.


>> "Or is there a login and a username? Stupid again"

Lots of sites allow you to log in with a username or an email address. I agree it can complicate things, but if you need an email address to validate that a user is real and a unique username for the service (e.g. it's social), allowing either for login will probably result in fewer forgotten logins.


How does an email address validate that a user is real?


Not real, I meant 'not a bot'. Ask for email, send a link that must be clicked etc.


That doesn't answer the question. There's no particular reason a bot can't have an email address.


Maybe I'm wrong, I was always under the impression that one of the main reasons to ask for an email address was to make things more difficult for bots. Rather than just set up a script to create accounts on your server, they would also need a valid email address for every attempt.


Try Star Citizen, which needed an email, login name AND username. Insanity.


And they have to be all different IIRC.


When I see forms that ask "what should we call you?" I have a very hard time (and often fail) resisting the urge to put "Captain ..." or "Beloved Leader ..."


> People in Korea, who typically do have 3 names but who don't usually initialize them...

No, not really. This is a completely wrong interpretation of Korean names. Most of us have two names: family/clan name ("last" name like "Park", "Kim", etc) followed by given name ("first" name like "Chan-Ho" or "Geun-Hye"). The given name typically consists of two parts, but that doesn't mean we have three names.


I was always curious as to why nearly every Korean person I've met has "two" first names. According to Wikipedia [1], traditionally Korean names were "generation names", where everyone in a specific family of a specific generation shares part of the first name. The example they give of an equivalent Chinese name would be someone named Xia Zhou-jin might have a brother named Xia Zhou-sui, and children Xia Han-zheng and Xia Han-Li (in this case Xia is the family name, Zhou and Han are generation names, Jin, Sui, Zheng and Li are personal given names).

I'm curious - is this no longer the case, or do you (either you personally or, if you can speak to it, the zeitgeist in Korea) not think of the "generation" part of the "generation name" to be a separate name?

1. https://en.wikipedia.org/wiki/Korean_name#Given_names


The "generation" part is only part (one syllable) of the given name and doesn't constitute a separate name. The "yong" in my name is "generation name", but no Korean will call me "Yong" any more than an English speaker would call Richard "Chard".

Because most Korean names are neatly split into three characters (each character is a syllable in Korean), many people also write their names in three space-delimited words in Latin alphabet. And then westerners get confused and end up with patterns like "Gil D. Hong". (I'm not blaming them; you can't expect everyone to understand all the world's naming systems.)


Just to push back on this a bit, the fact that the "generation name" part of the name carries some additional semantic information in some ways makes it even more of a name than a "middle name" that is somewhat popular in the West. In a sense, my name is my full name (first, middle and last), but it is split into three parts. The last one has some meaning (it's a family name), the first two are given names, and the cultural default here is that I would go by my first name, and if a conflict is found, I'd go by the first + last, and if a conflict still exists, you'd default to first + middle name or first + middle initial. I don't think it's that unreasonable to consider generation names to be "compound names", consisting of two sub-names, even if the cultural default is to always use the full compound and not either of the parts.

I also think that "Richard"->Chard is a disingenuous choice, because people named Richard often go by Rich. People named Andrew very frequently go by either Andy or Drew, similarly people named Alexander often go by either Alex or Xander. Of course, there's no semantic meaning associated with the component parts of those names anyway, so it's not like you can infer something from the fact that someone is named Alfred and someone else is named Albert, and either one might go by "Al".


Well, OK, maybe Richard was a poor example. What I'm saying is: some Richards go by Rich, but few go by Chard, and certainly nobody interprets Richard as a combination of two name components "Ri + Chard" (or "Rich + Ard"). The probability that a typical Korean would consider their given name a combination of two parts is probably higher than it is for Richards, but not much higher.

Maybe a better example is Anderson, which historically meant "Anders's son", but few living Andersons would consider "Anders" an acceptable way of writing their family name.


That's because it's really a combination of "Ric + Hard".

Seriously: https://en.wikipedia.org/wiki/Richard

:-)


FWIW, my parents didn't give me a generation name, and my kids don't have one. I have cousins and uncles (I think this is for males only? I could be wrong) who do have generation names. From what I understand, this practice is slowly dying out in favor of better sounding or more native-Korean names (without Chinese characters) in recent generations.


With the spread of the LGBT+ movement, the connection between titles, personal names, and gender can also change. Some people would rather not identify as being male or female, and instead identify as a specific gender minority. This has different consequences for different languages, since some have gender-specific naming and others have gender-specific titles, prefixes, suffixes, etc., but the fact is that most websites ignore gender minorities.

There is also the distinction between gender and sex, which websites often neglect (in some cases to the detriment of intersex individuals).

I would love to see more websites that either, like Facebook, treat gender as more than just a binary data value, or forgo it altogether. Of course, gender is always going to bring up problems when attempting to translate both names and standard text.


Thais didn't have a family name system until the early 1900s; everyone was named with short syllables like Daeng, Maew, Pu, Chai, etc. Once the family name system was introduced, the custom of naming someone with short syllables continued in the form of a family nickname. The trend continues today, so Thais have a nickname that isn't related to their full name.

Except in formal situations, everyone here refers to each other by nickname. Calling Thais by first name or family name is equally awkward (even more so for ex-nobles, who have a "na $Location" suffix in their family name, e.g. "na Ayudhaya"). So yes, please have a "What should we call you" in your form.
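
For what it's worth, here is a minimal sketch of how that could look on the storage side. The `Person` record and its field names are made up for illustration; the only idea taken from the comment above is keeping a separate "what should we call you" answer next to the full name, and never guessing a nickname from it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    # Hypothetical record: one free-form full name plus an optional
    # "what should we call you?" answer. Neither field is split or parsed.
    full_name: str
    preferred_name: Optional[str] = None

    def greeting_name(self) -> str:
        # Use the name the person asked for; otherwise fall back to the
        # full name exactly as entered, rather than guessing a "first name".
        preferred = (self.preferred_name or "").strip()
        return preferred or self.full_name.strip()
```

Someone who fills in only the full name is greeted with it in full; someone who also enters a nickname such as "Daeng" is greeted with that.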


What's the advice if you have a business requirement to support formal, informal and full names?

For example, the order confirmation e-mail might say "Good news Joe, your order is on its way" while the apology e-mail might say "Dear Mr Biden, we're sorry but your order isn't on its way" and the parcel might say "Joe Biden, White House, Washington DC"

Should you ask the customer to fill out three different fields?


If you want to be international, you don't even try that. You let the communication say "Dear $Name" and you are done. Even if it means none of the emails say what you want it to say. And yes, even for a US system I don't think it's a good idea to ask for first/initial/last or to ask for formal/informal. Chinese users in the US will be happy to just see a "Name" field. Just ask for a name!
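
A rough sketch of the "Dear $Name" approach, using Python's standard `string.Template`. The template texts and the `render` helper are assumptions for illustration, not anything from a real system; the point is only that every message takes the one name string verbatim.

```python
from string import Template

# Hypothetical message templates. The only personal data they need is a
# single free-form name string, inserted as the user typed it.
SHIPPED = Template("Dear $name, your order is on its way.")
DELAYED = Template("Dear $name, we're sorry, but your order is delayed.")


def render(template: Template, name: str) -> str:
    # No splitting into first/last and no title lookup: trim surrounding
    # whitespace and substitute the name as a whole.
    return template.substitute(name=name.strip())
```

With this, `render(SHIPPED, "Joe Biden")` gives "Dear Joe Biden, your order is on its way." and the same call works unchanged for a name in any script or word order.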

I'd stay away from gender/greeting too. Mr/Ms/Mrs/Dr etc. are way too culturally narrow even for the US.


This might break if the language the name is in has a vocative case http://en.wikipedia.org/wiki/Vocative_case


Interesting. If the message is in English, would such rules still apply? We are talking about formatting a message with a foreign name (origin unknown) in a known language, e.g. English.


Not normally. Typically you'd pick the 'least marked' version of the name and just use that.


> "Joe Biden, White House, Washington DC"

Speaking of web forms and DC, I hate when they have "Washington, DC" listed as the state. Never mind that it isn't a state; at least the territories have to put up with that, so we're not alone. But DC should be in the Ds, not the Ws. Washington is a city, not a state(-like thing), even if it isn't distinct from the district anymore. Well, Washington is a state, but… you know what I mean.


In such a case I would try to change the requirement. This is probably a somewhat unpopular opinion, but if I am giving a name to a system I'm going to put in my full name, and that is what I would expect it to always use to refer to me.

Having had some pretty irritating experiences with systems that would only retain or use a piece of the names I gave them (either a family name or given name), resulting in a collision, whenever I feel that is what might happen I give them a concatenation of my names in different orders for the different fields, so if I was John Doe I might enter "Doejohn Johndoe". Apparently I'm not the only one doing this.


Is it a business requirement to offend people? In some cultures "Good news Joe" is offensive. In others "Dear Mr Biden" is offensive.


>Is it a business requirement to offend people?

Yes, it often is. Years ago in college I worked in an international retail establishment. We were required by policy to address customers by their first names. It was a rural community with many older people, and many people were visibly unhappy being called their first name by a 19 year old kid they didn't know. And as an adult, I find myself in agreement to this day. If I don't know you, why are you calling me by my first name?


You rethink that business requirement, because it's virtually impossible to get that right against the full spectrum of names.


Yes. There is no other way to get all three cases correct.

(If this is a burden on your users, you should consider ways to drop this requirement.)


The first form we have to change is the birth certificate. When my daughter was born, we knew that she was going to have 4 names (2 given, 2 surnames). My wife is from the Philippines, and it is typical there to have 1 or more given names, and then maternal surname and paternal surname. It is common for people to have several names (e.g. the singer Sheree is actually Cherry Hazel Sweet Fae Bautista Augustin). Unfortunately the birth certificate only had first, middle, last. We didn't want to hyphenate the last name, so we had to have either a 2-word first name or a 2-word middle name. In the end, we went with 2 names for the first name, maternal surname as middle name, and paternal surname as last name.


My first name is hyphenated. I still run into web forms that insist that having a hyphen in your first name is "invalid" or that my first name is too long for the form. So I get mail addressed to all sorts of variations of my name not to mention people that "correct" my name by thinking that it is my last name that is hyphenated. And there's also the fact that search engines seem to ignore a hyphen so someone searching for me will get lots of results for someone whose first and last name make up my first name. That is real fun when that person is a famous kidnapper...
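
A minimal sketch of the kind of validation that would not reject a name like that. The function name and the length limit are hypothetical; the assumption is that a form should refuse only empty input, absurd lengths and control characters, and let hyphens, apostrophes, spaces and non-Latin scripts through.

```python
import unicodedata

MAX_NAME_LENGTH = 200  # hypothetical, deliberately generous limit


def is_acceptable_name(raw: str) -> bool:
    name = raw.strip()
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # Reject only control characters (Unicode category "Cc"). Hyphens,
    # apostrophes, spaces and letters from any script are all allowed.
    return not any(unicodedata.category(ch) == "Cc" for ch in name)
```

Under these rules "Jean-Luc", "O'Neil" and "제인" are all accepted, while an empty string or a name containing a NUL byte is not.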


I don't like the "Full name" label. It's very uncommon to be asked for your full name in the Netherlands. You use initials in almost all formal documents. Why not "Name"?


I loved that after moving in! The culture I'm from is actually more compatible with most entry systems than the Dutch one (no vans or anything), but my full name is long, and doesn't fit e.g. on credit cards. In the Netherlands, nobody bats an eye when I just reduce my names to a bunch of initials.


Fun fact: Pocahontas's given name was Matoaka. "Pocahontas" meant "little naughty one" and was a nickname, but among the Powhatans it was customary to give one name for ingroup people to call you and another for outgroup people to call you; and "Pocahontas" also filled that latter role.

Complicating matters further, she converted to Christianity and took the name Rebecca Rolfe.


Great article. This is forcing me to ask myself why I would ever need to separate first and last names. Maybe it is time to stop.


Don't forget about catering for the symbol formerly known as Prince :) No, but seriously, names are horrendously complex, just as addresses are. But you might be better off ignoring some detail than trying to cater for every possible case.


I tried something similar in my last organisation using millions of social media profiles: check out http://whatsinmyname.prokta.com


Here is an interesting video on names by Django core dev Russell Keith-Magee:

http://www.youtube.com/watch?v=KHg6AoExYjs#t=125


I wonder how much people adapt to things though... I work with a company that sells something with "first name", "last name" fields. And they do manage to sell it around the world. To my knowledge (I'm not involved in sales), the fact that the name fields are very Europe-oriented has never really been a problem: people buy the system because it's very good at other things, not because it's good at record keeping. In a newer version, we're considering dropping the first/last name, but some people are wary of that.


Well, people are flexible enough to mangle their names to fit pretty much any arbitrary criteria, and that is sufficient for selling & shipping goods. E.g. I can misspell a few letters of my name to fit ASCII and invent a middle initial if it's required for some stupid form.

It simply leads to names that don't really represent reality, and unhappy customers - people often are quite attached to their names and like to see them properly.


I didn't have a legal middle name until this year and I lost count of how many imaginary middle initials and names I've had to make up until now.

Then... My husband decided to hyphenate one of his names and Social Security explicitly didn't use the hyphen even though it's on the marriage certificate. I wonder what DMV and the Department of State will do with his documents next.

Even the author of the original post kinda gets it vague by saying Korean names are three names - maybe technically so - but the middle name is not a middle name like in the US sense. My parents' given name is the first+middle put together unless you're looking at their English names that got separated out. My name they chose on purpose to match in both languages - Jane in English, 제인 (jae-in) in Korean. My nickname Janey works in Korean too - 제인이. ;)

Speaking of Korean, I'm not sure there's even a process for changing my Korean name like there was my English name since it's rare for women to change their names after marriage. I'm looking into dual citizenship right now. The idea of having passports with mismatching names is kinda sad and funny at the same time.

Names are hard but there's definitely a nice feeling when people don't screw it up. :)


The point was that customers might not care too much if they just have to put something in a form, if what they are really after is much more important than getting their name right. And some customers might be happier having things in a "familiar" format.


People adapt fine. If you put leg irons on someone and tell them to pick cotton all day, chances are they'll adapt to that fine. That doesn't make it okay.


Really? That's the example you give in relation to the problem of storing people's names?


No, it's the example given in response to the suggestion that people from cultures other than yours should just shut up about how their name is stored because you don't care about their culture (and by extension about them, specifically because of their culture).


Let's just assign a UUID to every baby born and nature solves the problem in about 100 years. ^^

...historic data you say...



