The World Atlas of Languages (unesco.org)
44 points by neom on May 16, 2023 | 25 comments



Geekier but immensely more detailed: the World Atlas of Language Structures. https://wals.info/

Has both maps like https://wals.info/feature/13A#2/19.3/152.9 and detailed language feature inventories like https://wals.info/languoid/lect/wals_code_eng


WALS is great! A warning though: the individual data points aren’t always completely reliable. I regularly attempt to track down specific language features listed in WALS only to find that half of them don’t correspond to what’s in the source (which to be fair is quite characteristic of big linguistic databases). That being said, the overall geographic trends in the maps are still pretty reliable, and the accompanying ‘chapters’ [https://wals.info/chapter] are amongst the best free linguistic resources I’ve yet found.

From the same group, there’s also https://glottolog.org/ for general information about language families, and https://clics.clld.org/ for lexicon organisation. All these are part of a broader set of linked datasets listed at https://clld.org/datasets.html (I’ve found PHOIBLE particularly interesting, which collects phonological information). Again, the same provisos apply: generally helpful on average, but take specific data points with a grain of salt.


Sampled a few languages - this data set looks very bad. Better use https://www.ethnologue.com


Good grief - Australia and New Zealand (3) [1] !!!

My map suggests that Australia alone has rather more than 3 [2].

It's missing a host of Australian languages and simply not GIS mapping others, eg: Yolngu Sign Language [3] should be tied to Elcho Island [4] & surrounds.

Speaking of Yolngu .. [5]

[1] https://en.wal.unesco.org/discover/languages?f%5B0%5D=locati...

[2] https://mgnsw.org.au/wp-content/uploads/2019/01/map_col_high...

[3] https://en.wal.unesco.org/languages/yolngu-sign-language

[4] https://en.wikipedia.org/wiki/Elcho_Island

[5] https://www.youtube.com/watch?v=bdpoWcma4HE


Kinda useless: it just has the "name" of the language. It's not a world atlas, just a massive list with a simple search...


I emailed them to ask them why France wasn't listed as a country that uses French, here is the reply:

"Thank you very much for your email and for the interest in the UNESCO’s World Atlas of Languages (WAL)

Let me give you a small context about the WAL,

UNESCO Member States are asked to nominate Focal Points for the Atlas. These focal points are responsible for providing information on the linguistic diversity of each country.

As for France, we are currently discussing the nomination of a Focal Point, which is why there is no record of French or other languages in the current profile.

The Atlas as mentioned in the terms of use is a work in progress and the purpose of nominating Focal Points is that the information is up to date and from official sources.

Once the Focal Point will be nominated, the information will be published."

Looks like it's a WIP. I submitted it because it seems like a decent tool for seeing the scope of endangered languages, which was a lot broader than I'd imagined.


Spanish Catalans will not be pleased to read that Catalan is listed as spoken only in Andorra.

https://en.wal.unesco.org/languages/standard-catalan


Why such a dearth of sorting/filtering options?

I've noticed this trend for several years. Who decides that "oh - they'll only need alphabetical"?

This stuff is (usually) trivial to implement and it was the first thing I did when learning CRUD development.

LET ME SORT BY ALL THE THINGS!
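
For what it's worth, here is a minimal sketch of the kind of "sort by any column" helper I mean. It assumes each record is a plain dict; the field names (`name`, `country`, `speakers`) and the sample rows are made up for illustration and are not taken from the Atlas.

```python
from operator import itemgetter

# Hypothetical records: field names and values are placeholders, not Atlas data.
LANGUAGES = [
    {"name": "Lang C", "country": "Country X", "speakers": 1_200_000},
    {"name": "Lang A", "country": "Country Z", "speakers": 50},
    {"name": "Lang B", "country": "Country X", "speakers": 300},
]


def sort_records(records, keys, descending=False):
    """Return a new list sorted by the given field names, in priority order."""
    # itemgetter with several keys returns a tuple, so ties on the first
    # field are broken by the second, and so on.
    return sorted(records, key=itemgetter(*keys), reverse=descending)


by_country_then_name = sort_records(LANGUAGES, ["country", "name"])
by_speakers_desc = sort_records(LANGUAGES, ["speakers"], descending=True)
```

A real site would also need to handle missing fields and push the sort into the database query, but the core of it is about this small.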


Is there a visual map of how densely distributed various languages are geographically?

Present-day metropolises like NYC will obviously have peaks. But the general distribution of the number of languages spoken would also be interesting to see.

Even more interesting to see would be a time-lapse of how the distributions changed over decades.


Apparently there are only 3 languages in Australia/New Zealand, and they're all sign languages.

No mention of the 200+ endangered languages in Australia, or even Maori in NZ, where it is an official language.

Might be a better site when the data is properly populated.


typed in 'hebrew':

got inaccurate number of speakers, told it is spoken primarily in poland and turkey.


And Yiddish (as opposed to "Eastern Yiddish" and "Western Yiddish") has "99,999,999" speakers in .... "Sweden"!


Remember that it's UNESCO who wrote this, and they have an anti-Zionist political skew.

They make a point of separating "Modern Hebrew" from "Hebrew".


They're also separating "Chaouia of the Aures in Algeria" https://en.wal.unesco.org/countries/algeria/languages/chaoui... from "Chaouia of the Aures in Algeria" https://en.wal.unesco.org/countries/algeria/languages/chaoui... so I don't think any political skew is involved at all; their data is just a mess.


Hebrew is a much broader linguistic phenomenon than just modern Hebrew. Hebrew existed as a spoken language thousands of years ago and has existed continuously since then in many different forms in different Jewish communities primarily as a liturgical language.

Modern Hebrew (or probably “Israeli Hebrew” would be more accurate) was then constituted as a vernacular fairly recently as part of the Zionist project of creating a new national identity, distinct from the existing identities of the Jewish populations who emigrated to Israel.


They also separate Ancient, Mediaeval, and Modern Greek. Is that a political agenda too?


I guess constructed languages are not languages. Esperantistoj! Ĉu vi aŭskultis tion?


I don't quite understand the criteria of these but it seems all of the sign languages are "Endangered/unsafe"

That seems really suspicious to me.

It even goes as far as marking Australia's as "Definitely Endangered":

Australian Sign Language Definitely endangered.


Apparently it shows Cornish on the map for each "<language> in <place>" to me (FF 102.10.0esr, JS allowed on that page), and nothing otherwise.


Looks like this is WIP and far from complete.


Incredibly crappy. It is obviously constructed by asking countries themselves about the languages.


The number of speakers indicated for various languages is off by several orders of magnitude. This is garbage.


"Bosnian" and "Croatian" are different languages? Really?


If you look only at linguistics, the current version of standard Croatian would be a dialect of Serbian, but naming languages has never been a purely linguistic matter. That's why we have languages where the name is the most significant difference.


“A language is a dialect with an army and navy” (https://en.wikipedia.org/wiki/A_language_is_a_dialect_with_a...)



