Even though Russian does drop the "to be" in present tense for all pronouns, oth...

mananaysiempre · on March 4, 2022

I also think this is because when you're learning Russian as a non-native speaker, the courses focus mainly on Cyrillic (naturally so) so you never really learn [the romanization conventions].

There’s a number of romanization standards for Cyrillic, and which one is the most intuitive might be language-dependent (e.g. what Russian writes as a stressed ‹и› Ukrainian writes as ‹і›, which post-1918 Russian doesn’t use; what Ukrainian writes as ‹и› is closer but not identical to what Russian denotes ‹ы›, which Ukrainian doesn’t use).

Wikipedia has a good summary table[1] of some of the formal standards, but these don’t cover some of the vernacular usage, so to speak. The most frequent quirk is probably writing ‹щ› as ‹sch›, as you do, when standards insist on ‹shch› (my guess is because the Belarusian counterpart of ‹щ› is ‹шч›, which would also be written ‹shch›); most confusing is perhaps writing the masculine adjectival ending ‹-ий›, ‹-ый› as ‹-y› or rarely ‹-yy› (writing ‹Navalny› for the surname ‹Навальный› or ‹Zelenskyy› for the surname ‹Зеленський›, maybe because old 19th-century German-inspired romanizations usually wrote it as ‹-i›, merging the ending with its Polish counterpart). I have to say I’ve never seen ‹zsh› instead of ‹zh› for ‹ж›, though.

Empirically, I’ve also found that French people struggle with pronouncing an English-inspired transliteration (‹sh› for ‹ш›, ‹ch› for ‹ч›), but have no problem with finding at least a pronounceable fallback using a Czech/Serbo-Croatian-inspired one (‹š› for ‹ш›, ‹č› for ‹ч›), so diacritics might indeed be underrated here.

[1] https://en.wikipedia.org/wiki/Romanization_of_Russian#Transl...
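
To make the ‹щ› / ‹шч› point concrete, here is a minimal sketch in Python. The two tables are partial and purely illustrative: `ENGLISH_STYLE`, `CZECH_STYLE`, and `romanize` are hypothetical names, and the ‹šč› and ‹ž› entries are filled in from the usual scientific-transliteration convention rather than from anything stated in this thread. It is not an implementation of any formal standard.

```python
# Partial, illustrative mappings only (lowercase letters, four consonants).
# These tables are assumptions for the sketch, not a complete standard.
ENGLISH_STYLE = {"ш": "sh", "ч": "ch", "щ": "shch", "ж": "zh"}
CZECH_STYLE = {"ш": "š", "ч": "č", "щ": "šč", "ж": "ž"}


def romanize(text, table):
    """Replace each mapped Cyrillic letter; leave everything else untouched."""
    return "".join(table.get(ch, ch) for ch in text)


# Russian ‹щ› and the Belarusian-style sequence ‹шч› collapse to the
# same Latin string under either table.
for table in (ENGLISH_STYLE, CZECH_STYLE):
    print(romanize("щ", table), romanize("шч", table))
```

Under either table the single letter and the two-letter sequence come out identical, which is consistent with the guess above about where ‹shch› comes from.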

dhosek · on March 4, 2022

I'd for a long time puzzled over the mysterious \t accent in TeX. Why did DEK implement this obscure diacritical and leave out the more useful Ogonek?¹ It was only when I stumbled across it (somewhat garbled) in an online card catalog entry that I discovered that it was used in some Russian transliteration schemes to represent digraphs like t͡s for ц.

⸻

1. I suspect the other reason was that implementing the tie was technically simple since, by allowing it to extend past its spacing width, it can be treated like any other above-character accent. Ogonek, on the other hand, unlike a cedilla or the dot-under diacritic, requires positioning based on the letter that it's attached to, so it can't be programmed as easily as a floating diacritic mark.

mananaysiempre · on March 4, 2022

I see you didn’t spend a week sick in bed with no fresh reading material except for a copy of the TeXbook :) Exercises 9.4 and 9.5 from the chunk of exercises on diacritics there mention ‹Akademii͡a› for ‹Академия› (usually ‹-ija› or ‹-iya›) and ‹I͡urʼev› for ‹Юрьев› (usually ‹Ju-› or ‹Yu-›), which I remember (sans the numbers of course) exactly because I had never before seen that romanization[1] and thought it weird. But apparently the Library of Congress does use it, and if you can get hold of official English translations/selections of Soviet physics or mathematics journals from the 70s and 80s you’ll see the authors’ names spelt according to it as well.

Note that in modern times, you’re supposed to use the tie to spell affricates and such in IPA as well, like in ‹t͡ʃ› and ‹d͡ʒ› and, yes, ‹t͡s›, even though nobody does as far as I’ve seen.

[1] https://en.wikipedia.org/wiki/ALA-LC_romanization_for_Russia...
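
For anyone who wants to type these, here is a small sketch of how the tie can be produced in plain Unicode text. It assumes the tie is encoded as U+0361 placed after the first of the two letters, which is one common choice; other encodings exist, and the names `TIE` and `tie` are made up for this example.

```python
import unicodedata

# Assumption: the tie is U+0361, written after the first letter of the pair.
TIE = "\u0361"


def tie(first, second):
    """Join two letters with a tie spanning both of them."""
    return first + TIE + second


# The digraphs mentioned in this thread: ц, я, ю in the tie-based scheme.
for pair in ("ts", "ia", "iu"):
    print(tie(pair[0], pair[1]))

# Look up the official character name instead of hard-coding it.
print(unicodedata.name(TIE))
```

Whether the result renders as a single arc over both letters depends on the font, which may be part of why it shows up garbled in catalog entries.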

dhosek · on March 5, 2022

Alas, I read the TeXbook originally in 1986 and while I've dipped into it a lot since then, I've not re-read it in its entirety since that first read. I'm the one responsible for adding a section about the ALA-LC romanization to the Tie (typography) article on Wikipedia.

ideaoverload · on March 4, 2022

The Polish example seems to mix up two forms:

  Jestem z Polski == I am from Poland
  Jesteś z Polski == You are from Poland
  Jestem Polakiem == I am Polish
  Jesteś Polakiem == You are Polish

Both pairs are very close in meaning, but 3 and 4 seem to be the equivalents of the examples given above for other languages.

Source: I am Polish

smcl · on March 4, 2022

In Czech it's "jsem Rus" and "jsi Rus", the latter can have a "ty" but it's not necessary and I think Slovak is the same.

mongrelion · on March 5, 2022

Which makes sense. The verb has already been conjugated and gives away which person it refers to. Same in Spanish: Yo soy (you can drop the yo because soy only applies to yo). Same for other pronouns.

cleancoder0 · on March 4, 2022

Yep, Croatian/Serbian have a phonemic orthography. Quite easy to write given that sounds map directly to a letter.

medo-bear · on March 4, 2022

the maxim is to "speak as you write and write as you speak". however, there are variations in how the rule is adopted. croatians will tend to write foreign words as written in the original language, while in serbia you always follow this rule. so for example in serbia it is Majkl Džordan while in croatia it's just Michael Jordan

mongrelion · on March 5, 2022

I always found this funny. In Russian "second hand" is spelled секонд хэнд, which I find hilarious.

msrenee · on March 5, 2022

Thanks for all the info. I definitely underestimated the number of languages that don't use Cyrillic. I also had it in my head that dropping "to be" in the present tense was more common. Do you know anything about the history of that as far as where, linguistically, it started being dropped? Like which Slavic language groups drop the verb and which don't? I don't know if that makes sense.

mongrelion · on March 5, 2022

I truly have no idea. My knowledge of Russian is very limited. Maybe someone else can chip in.

cesnja · on March 4, 2022

On the other hand we commonly omit the "I"/"Jaz" in Slovenian - there's more than enough info in the conjugated verb.

mongrelion · on March 5, 2022

Same in Spanish