
The article is looking at efficiency in terms of time to transmit a given message. For that you want to assign the shorter codes to the symbols with the highest frequency. Morse did a decent job of that, except that assigning '---' to 'O' seems way off, since '---' is in the top 5 for longest code length, whereas 'O' is one of the 5 most frequent symbols.

I wonder what would change, if anything, if instead we considered things from the receiving side, and put a bound on the acceptable error rate? That '---' for 'O' really stands out when listening to code. The only other letter that has a '---' in it is J ('.---'), which only occurs about 2% as often as 'O'. Maybe 'O' being so distinct and easy to hear, and frequent enough that you will have an 'O' every 10 or so characters, helps keep the listener synchronized?

Early Morse code was sent fully by hand, and so the timing would not be precise. The timing is supposed to be, in units of the length of a dot: 1 for a dot, 3 for a dash, 1 for the space between adjacent dots and dashes within a character, 3 for the space between characters in a word, and 7 for the space between words. A good, experienced operator would hit that timing very accurately, but less experienced operators could be quite a bit off.
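To make the timing rule concrete, here is a minimal Python sketch that adds up the length of a character or message in dot units. The function names and the small lookup table are my own, for illustration only; the table covers just the letters used in this thread.

```python
# International Morse timing, in units of one dot length.
DOT, DASH = 1, 3
ELEMENT_GAP, CHAR_GAP, WORD_GAP = 1, 3, 7

# Illustrative subset of the International Morse table (not the full alphabet).
MORSE = {
    "A": ".-", "D": "-..", "E": ".", "J": ".---",
    "K": "-.-", "O": "---", "R": ".-.", "T": "-", "W": ".--",
}

def char_units(code):
    """Length of one character: its marks plus the gaps between them."""
    marks = sum(DOT if symbol == "." else DASH for symbol in code)
    return marks + ELEMENT_GAP * (len(code) - 1)

def message_units(text):
    """Length of a message, including inter-character and inter-word gaps."""
    total = 0
    for word_index, word in enumerate(text.upper().split()):
        if word_index:
            total += WORD_GAP
        for char_index, ch in enumerate(word):
            if char_index:
                total += CHAR_GAP
            total += char_units(MORSE[ch])
    return total

print(char_units(MORSE["E"]))  # 1
print(char_units(MORSE["O"]))  # 11
```

With perfect timing, 'O' costs 11 units against 1 for 'E', which is the inefficiency the article is pointing at.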

Someone whose timing is off might shorten the gap between characters enough to run the dashes at the end of one character into those at the start of the next. For example, in the word 'awkward', the 'wk' sequence is '.-- -.-'. If the sender did not leave as big a gap as they should between the characters, the trailing '--' of the 'w' and the leading '-' of the 'k' could run together into a '---'. Even in that situation I don't think it would sound like an 'O', because with an 'O' you go into it trying to make 3 evenly spaced things, and we are good at that. We might get the spacing off, but we'll get it off the same way uniformly throughout the 'O'. An accidental 'O' from running two things together won't have that uniformity, and so I think it would stand out.
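A rough way to picture this is to draw the keying as a timeline, one character per dot unit. The sketch below is only an idealized model under my own assumptions (the `keying` helper and the `char_gap` parameter are made up for illustration, and real sloppy sending would vary in more ways than one gap):

```python
# Illustrative subset of the International Morse table.
MORSE = {"W": ".--", "K": "-.-", "O": "---"}

def keying(text, char_gap=3):
    """Render key-down time as '=' and key-up time as '_', one char per dot unit."""
    chars = []
    for ch in text.upper():
        elements = ["=" * (1 if symbol == "." else 3) for symbol in MORSE[ch]]
        chars.append("_".join(elements))  # 1-unit gap inside a character
    return ("_" * char_gap).join(chars)   # char_gap units between characters

print(keying("O"))               # ===_===_===
print(keying("WK"))              # =_===_===___===_=_===
print(keying("WK", char_gap=2))  # =_===_===__===_=_===
```

In the last line the three dashes in the middle are separated by gaps of 1 and 2 units, so they are not evenly spaced the way a deliberate 'O' is. Only if the sender squeezed the inter-character gap all the way down to a single unit would this model produce something identical to an 'O'.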

In other words, I'm guessing that the apparently anomalous assignment of a seemingly too long code to 'O' actually serves to make communication more accurate in the presence of inexperienced senders and receivers.




On the other hand, the letter O in the older American Morse was relatively short because it contained a "long" internal gap: "dit-dit" (as opposed to I being "didit" and the "word" EE being "dit, dit"). If the regular gap is the length of a dot, and the inter-letter gap is the length of three dots, the length of the gap in O was two dots.

International Morse eliminated the "long" internal gap (according to Wikipedia [1], this had an advantage on the first long undersea cables), so O had to be re-encoded. '---' ("dahdahdah") was the only three-element code not already being used for a letter. (It happens to be the number 5 in American Morse.)

When I was a Novice-class ham many years ago, I found that older hams would sometimes send the prosign C (in the sense of "confirm" or "yes") as didit-dit instead of dahdidahdit. I never really understood why; I just went along with it. Turns out didit-dit is C in American Morse.

[1] https://en.wikipedia.org/wiki/American_Morse_code


"Early Morse code was sent fully by hand"

The earliest Morse telegraphy systems were actually built around paper tape. Operators figured out that they could transcribe directly from the sound of the machine, obviating the need for the costly and difficult-to-deploy equipment. The skill developed from there, with equipment evolving to accommodate human operators in real time.

"Someone whose timing is off might shorten the gap between characters enough that it might run dashes from the end of one character and the start of the next together."

Variations between operators exist, and specific operators can be identified by their "fist" alone, but the variations are consistent for a given sender, so receivers adapt relatively easily. Noise is the bigger problem. Energy from a lightning strike a thousand miles away ricocheting around the ionosphere can wipe out lots of dits and dahs, and receivers either correct incoherent transmissions when they can or request that something be repeated.


"Early Morse code was sent fully by hand, and so the timing would not be precise."

Correct. My grandmother was an operator during WWII in Australia, communicating with various places northward such as Papua New Guinea. She said that she could easily recognize individual operators via their timings.


This has actually been proposed as the explanation for the distribution of word lengths seen across natural languages: http://www.pnas.org/content/108/9/3526.long.



