
One of my favorite Unicode oddities is in the Cyrillic block. Some books used to use the letter "ꙩ" when writing the word eye (ꙩко). The letter was (from what I can tell) never used for anything else. Because the dual number is a thing, clever people then went on to write ꙭчи or Ꙫчи to mean two eyes.

Both those characters made it into Unicode, as there was some use in historic scripts. However, even completely absurd variations made their way into Unicode. The "many eyed seraphim" is written as серафими многоꙮчитїи. So if you need to write something with a lot of eyes, you can use ꙮ.
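
For anyone who wants to check what these letters are officially called, here is a minimal Python sketch using the standard `unicodedata` module. It assumes your Python ships a Unicode database recent enough to include these Cyrillic letters; the names `eye_letters` and `describe` are just for illustration.

```python
import unicodedata

# The "eye" letters mentioned above, from one eye up to many.
eye_letters = ["ꙩ", "ꙭ", "Ꙫ", "ꙮ"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in eye_letters:
    print(ch, describe(ch))
```

The last one should come out as CYRILLIC LETTER MULTIOCULAR O, which is about as good as character names get.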




These four Unicode symbols have a similarly obscure origin: EDIT: automatically removed by HN

https://en.wikipedia.org/wiki/Go_(game)#Notation_and_recordi...

Relevant thread on the Unicode mailing list, with Subject "Purpose of and rationale behind Go Markers U+2686 to U+2689" is here: https://unicode.org/mail-arch/unicode-ml/y2016-m03/thread.ht...


> EDIT: automatically removed by HN

I'm curious about this. Was your post removed for containing certain unicode characters?


Probably not the whole post, just the uncommon characters. A few days ago HN deleted several characters from my comment immediately upon posting.

https://news.ycombinator.com/item?id=24851415


Just tested it, tried a comment with an emoji and it never appeared. So I'd assume so.
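
Nobody here knows what rule HN actually applies, so the following is only a guess at the kind of filter that would produce this behaviour: a Python sketch that drops every character whose Unicode general category is "So" (Symbol, other). The function name `strip_other_symbols` and the choice of category are assumptions for illustration, not HN's real code.

```python
import unicodedata

def strip_other_symbols(text):
    """Hypothetical filter: remove characters in general category 'So'."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "So")

# U+A66E is the many-eyed o, U+2686..U+2689 are the Go markers,
# U+1F600 is an emoji.
sample = "eyes: \ua66e, go markers: \u2686\u2687\u2688\u2689, emoji: \U0001F600"
print(strip_other_symbols(sample))
```

Under that assumption the Go markers and the emoji vanish, while ꙮ survives because it is categorised as a letter rather than a symbol, which at least matches what people are seeing in this thread.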


I googled up the paperwork proposing the inclusion of ꙮ and variants a few years ago and got the impression that a) you're not supposed to include something that only has a single recorded use and b) ꙮ pretty much has a single recorded use.


Interestingly, the Phaistos disk[1] has its own little bit of Unicode and from what I understand the only known “original” use of all of the symbols is the Phaistos disk itself. I guess there was enough discussion about the disk and its symbols to include it.

[1] https://en.m.wikipedia.org/wiki/Phaistos_Disc


The Phaistos disk sort of makes sense to me since, single use or not, it's (probably) some sort of script. I'm not really any kind of expert in ancient scripts or Unicodery and this was a while back but the gist I got from the docs was something along the lines of 'no one-time/decorative uses' and the eye thing looked like it was exactly that. The submission was from an academic in Slavic studies of some sort, I thought about emailing them to ask but couldn't really come up with a way to phrase what amounts to some version of 'did you mess up Unicode or what?' in a non-dickish way.


I guess you could argue that a single discovered extant usage is not single use, it's quite possible that there will one day be a reconstruction of the script (maybe with some more samples being discovered).


You mean the eye thing? That's from Cyrillic, a baby of a script (as scripts go) that's still in wide use today. The o-as-eyes is a decorative flourish, a little visual pun - it's still just an o. An analogous thing would be a medieval Latin parchment of, let's say, the Lord's Prayer and it opened with a bigass P with vines, a gargoyle dancing to a cat shredding on the lute and a tiny caricature of the scribe's dad. If someone found that, we probably won't end up with ʟᴀᴛɪɴ ᴅᴀᴅ ᴊᴏᴋᴇ ᴘ in Unicode.


I think wisty was saying you could make the argument that the glyphs on the Phaistos disk were also used elsewhere, but we just don't have any samples.

There's no way to know one way or the other.


Oh, sure, if it's about the Phaistos stuff - it sounds reasonable to have them in Unicode to me but much more importantly, I'm oversimplifying/butchering/misremembering whatever the actual Unicode rules are. You're better off just assuming I'm wrong about their details in important ways.


Well, it seems a suitable one character representation for "big brother".

Maybe it'll become in-vogue over the next few years? ;)


"big brꙮther"


On the other hand ꙮ has found a new use as "ornament". I sometimes spot it in pagination markers etc.


See, that’s the kind of thing that’s absolutely fascinating. Some scribe thought they were being clever a few centuries ago, and the glyph will now live on forever in our modern equivalent of a collective myth known as “the Unicode standard”, with people discovering new uses for it for generations to come.


It's a nice little doodleglyph, no doubt. I never thought I'd be virtually pointing at an internet stranger and being all 'yes, that's also one of my favourite weird things in Unicode!', though.


> One of my favorite Unicode oddities is in the Cyrillic block. Some books used to use the letter "ꙩ" when writing the word eye (ꙩко).

Interesting. That "ꙩко" looks phonetically (if that's the right word, I'm not well up on linguistics (if that's the right word again, ha ha)) a bit like the Hindi word "aankh" for "eye". The "n" sound in "aankh" is emphasized less, for lack of a better term. Actually, in Hindi, it is shown as a dot on top of one of the other letters, to show that.

Also reminded by this, via George Borrow's novel Lavengro[1] (a story about gypsies), that the gypsy and Hindi words for "nose" are similar, "nak".

One theory is that the gypsies (Roma(ni)[2]) migrated from northwestern parts of India to other parts of the world, such as North Africa and Europe.

Aankh (pronounced almost like aak) and nak, get it? :)

[1] https://en.m.wikipedia.org/wiki/Lavengro

[2] https://en.m.wikipedia.org/wiki/Romani_people


Most modern Indo-European words for 'eye' share a common origin; this predates the Romani and their migrations by a long stretch. Here's an eye:

https://www.etymonline.com/word/*okw-?ref=etymonline_crossre...

Nose is similar:

https://www.etymonline.com/word/*nas-


Romani is indeed much closer to Hindi than to other Indo-European languages, but the words “eye” and “nose” are not particularly good indicators.

Much better indicators are the grammar, the pronunciation, etc.


> The "n" sound in "aankh" is emphasized less, for lack of a better term.

Sometimes an original nasal consonant will reduce to a vowel that remembers the original consonant only by releasing air through the nose. (Where ordinarily the air would come out of the mouth.)

https://en.wikipedia.org/wiki/Nasal_vowel

This is a big thing in Portuguese and French. (At this point, the consonants are long since gone and the nasal vowel is correct Portuguese/French. But the change would have originated in people speaking something closer to Latin, which doesn't use nasal vowels, and being "careless" with their pronunciation.)

Is this what you're talking about?


Elision of -m and nasalisation and/or lengthening of the preceding vowel was already happening in Classical Latin, including the variety spoken by ruling elites, and is attested through poetic meter and other sources. See W. Sydney Allen, Vox Latina, 30–31. Also youtuber ScorpioMartianus has invested some time into training himself into using reconstructed pronunciation and has talked extensively about that, see for example https://youtu.be/psYM-LvBplw


Granted; I spoke much too broadly. According to the classical sources, -m fully disappears when followed by a vowel. (Though same-word intervocalic -m- does not.) Poetic meter backs this claim up robustly.

It's actually a little bit weirder than that; the vowel before -m also disappears. But it's certainly plausible for some nasalization to remain anyway.

> and/or lengthening of the preceding vowel

You're referring to -ns- / -nf-? You're also right there. As far as I'm aware, this doesn't happen for -nd- / -nt-, though.

> Also youtuber ScorpioMartianus has invested some time into training himself into using reconstructed pronunciation and has talked extensively about that

While that sounds like a cool project, I don't think it necessarily has a lot to tell us about the historical pronunciation. I think you could develop a pronunciation system that matched nearly every documented feature of a dead language while failing to match a large number of undocumented features.


> While that sounds like a cool project, I don't think it necessarily has a lot to tell us about the historical pronunciation

You could say the same about pronunciation research published in linguistic journals. Let me use an analogy:

Imagine looking at the source code of a game. It's technically possible for a reader to technically understand what the program is doing and understand what the game is about, how it works, its rules and goals.

However, if you pass the sources through a compiler (whose behaviour you also can well understand) what you end up with is a game you can run and experience.

Reconstructed pronunciations are a bit like that. You get to "experience" rules that are otherwise coded in an abstract language. Translating those rules into something you experience actually requires a lot of effort and expertise. You can in theory become a "compiler" and learn how to do it yourself (aloud or in your head) but it's hard; what's wrong with outsourcing it?


> Let me use an analogy:

> Imagine looking at the source code of a game. It's technically possible for a reader to technically understand what the program is doing

This is already well beyond what's possible for a dead language. It's not even possible for living languages, although in that case we can draw empirical conclusions.

I've been interested for a long time in the question of how we can determine how a language divides up the space of possible sounds. For example, English [θ] (the sound at the beginning of "thick") is perceived by Mandarin speakers as being the sound [s] (as in "sick"). It is perceived by Cantonese speakers as being [f] (as in "fickle").

The sounds [s] and [f] are both phonemic in both Mandarin and Cantonese. But something about the phonology of each pushes the sound [θ] into one category or the other. The choice is not arbitrary; it is quite consistent across speakers of each language.

To the best of my knowledge, we have no way to answer the question "how would language X categorize sound Y?" other than experimentation, which is impossible with a dead language. But it is a fact about the language, and in principle the question can be answered solely by looking at the pronunciation of sounds within the language -- in the ordinary course of events, Chinese speakers would go their entire lives without being exposed to the sound [θ], and yet they would largely agree with each other on what the sound was if they did hear it.

I say that this categorization question draws upon rules of pronunciation which we don't presently have a good idea of how to describe or characterize at all.

So I say reenactment of a dead language is an interesting project, but you're inevitably going to make choices that are wildly different from the language as it existed in the past. Pronunciation reconstruction is on much firmer ground -- and it gets there by not addressing most questions. But a reenactment cannot avoid addressing every possibility, and it's going to get most of them wrong.


> It's not even possible for living languages

YMMV. I once watched a short video by an accent coach teaching how to do an Irish accent, a Scottish accent, an Australian accent etc. He talked about place of articulation and made pretty decent (although clearly not native) approximations of the pronunciations. I found his attempts at actively voicing things out quite helpful. I'm fully aware this is just an approximation, but in a way I found that teacher to be more effective at conveying what makes a given accent peculiar than just listening to a native speaker would be. Probably it all depends on what you're interested in.


Interesting.

>Is this what you're talking about?

I'm not quite sure, since I don't know much about phonetics / linguistics, as I said above.

Something like this, the French sound from near the top of your link above:

https://upload.wikimedia.org/wikipedia/commons/0/0e/Fr-en.og...

But that nasal sound is not exactly the same as the nasal sound in aankh (at least to my untrained ear).

Can't describe it better than that, sorry.



