Wait, people would just call somewhere and say the extension? That sounds awfully rude to me. Wouldn't you normally say, "Hi, can I have extension 432, please?"
It does the same thing the article says, but you sound like a decent human being.
What it reminded me of is when, after waiting on hold forever, the CSR says something like: "Thanks for calling ABC Company. What's your account number?" or "What is your phone number?" Having the first "words" a customer utters be numbers is probably not the best idea. I wonder how many hours of waiting on hold could be eliminated if more companies understood this phenomenon.
I also wonder if the full word "hello" would be necessary, since the word "hi" makes a similar cavity sound to the spoken number "five"?
With something like Twilio I don't think you have any excuse for this part of the interaction: play a recorded greeting saying "Hiya, for fastest service, type in your account number. If you don't know your account number, hit 0 and one of our operators can help you look it up." If you get their account number, pick the appropriate CSR, bring up the details directly on their screen, forward the call and drop them straight into "Thanks for calling FooCorp. This is Melinda. To whom am I speaking please?" Melinda hears the customer's name, a necessary formality because many accounts have multiple people in charge of them -- the most common situation in B2C is husband and wife. "Thanks for calling, Mr. Smith. What can I help you with?" If Mr. Smith says his last order didn't ship yet, Melinda is a single keystroke away from pulling it up.
Ooh, the possibilities with Twilio make me giddy. If I can only productize one of them... (Crikey, the possibilities for per-company customization are endless. Do customers ever call you? You should be using Twilio.)
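For anyone curious, the greeting-and-routing flow described above might look roughly like this. It is only a sketch: it builds the TwiML by hand with the standard library instead of using Twilio's helper library, and the action URL, the 8-digit account length, the 10-second timeout, and the `accounts` lookup table are all made-up assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def greeting_twiml(action_url, account_digits=8):
    """Build a TwiML <Response> that plays the greeting and collects keypad digits.

    action_url and account_digits are hypothetical; a real deployment would
    use its own webhook URL and account-number format.
    """
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "action": action_url,               # webhook that receives the digits
            "numDigits": str(account_digits),   # stop after a full account number
            "timeout": "10",                    # assumed pause allowance, in seconds
        },
    )
    say = ET.SubElement(gather, "Say")
    say.text = (
        "Hiya, for fastest service, type in your account number. "
        "If you don't know your account number, hit 0 and one of our "
        "operators can help you look it up."
    )
    return ET.tostring(response, encoding="unicode")


def route_call(digits, accounts):
    """Pick a queue for the caller from the digits they typed.

    accounts is a hypothetical mapping of account number -> record with a
    'csr_queue' key. Unknown numbers and "0" fall back to a human operator.
    """
    if digits == "0" or digits not in accounts:
        return {"queue": "operator", "account": None}
    return {"queue": accounts[digits]["csr_queue"], "account": digits}
```

The webhook behind `action_url` would call something like `route_call` with the digits it receives, then forward the call and pop the account up on the chosen CSR's screen. That part depends entirely on the company's own systems, so it is left out here.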
How hard would it be to manufacture a little device that plugged into the phone line (or just listened with a built-in microphone) and displayed the digits/dial-tones coming over the line on a seven-segment display? Then the CSR could just ask you to dial your information in, then read it and use it as normal.
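On the decoding side, this is not hard at all in software. Each key sends two sine tones at once (one "row" frequency and one "column" frequency), and the Goertzel algorithm is a cheap way to measure the energy at each of those eight frequencies. Here is a minimal sketch, assuming 8 kHz audio and a clean 50 ms chunk containing exactly one key press. It does no silence or noise detection, which a real device would need, and the function names are just illustrative.

```python
import math

# Standard DTMF keypad frequencies in Hz.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Return the signal power near freq using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row tone and strongest column tone, then map to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], sample_rate))
    return KEYPAD[row][col]


def synth_key(key, sample_rate=8000, duration=0.05):
    """Generate a test tone for one key (sum of its row and column sines)."""
    for r, line in enumerate(KEYPAD):
        c = line.find(key)
        if c >= 0:
            n = int(sample_rate * duration)
            return [
                math.sin(2.0 * math.pi * ROW_FREQS[r] * t / sample_rate)
                + math.sin(2.0 * math.pi * COL_FREQS[c] * t / sample_rate)
                for t in range(n)
            ]
    raise ValueError("not a DTMF key: " + repr(key))


# Decode a synthetic "extension 432" one key at a time.
extension = "".join(detect_key(synth_key(k)) for k in "432")
```

Feeding the decoded digits to a seven-segment display (or straight onto the CSR's screen) is then the easy part.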
I've never encountered an operator who takes extension numbers rather than names, but I'd say, for example: "Mark Thompson, please" without thinking it's rude (as long as I didn't know the operator). You gotta "make the nice" somewhere though :-)
I fully support being polite, but there's a lot more to vowels than relative pitch. When speaking a vowel, the geometry of your mouth creates multiple resonant frequencies. The ratio among those frequencies is far more important than the ratio of the dominant frequency of one vowel to the dominant frequency of the next.
Consider a few examples:
1. A trombone can't say "I Owe You", despite the fact that it can produce any dominant frequency, and that phrase contains only vowels.
2. Voice "Ahhhhhhhhhhhh", going from a high to a low pitch (I imagine the Doppler effect of a cartoon character screaming as he rides by on a cart), and then voice "Ohhhhhhhhhhhhhh" in the same fashion. At every point, they sound different. You can't take a clip from an arbitrary pitch range of an Ah and convince someone it's an Oh.
3. Shape your lips like you would for oo as in Moon, Noon, or Spoon, and try to make the ea vowel as in Squeak just by changing your pitch. It won't happen. You'll get the German ö, but not the English ea.
Actually, #1 isn't quite true. A trombone can be made to sound like speech in a couple of ways.
First, with a plunger mute. The adults in the Peanuts cartoons were "voiced" by a trombone with a plunger mute, which gives the impression of a voice without necessarily sounding like it's saying anything in particular. (http://en.wikipedia.org/wiki/Peanuts#Television_and_film_pro...)
Second, it's actually possible to make it sound slightly like specific words by altering the shape and size of your vocal cavity while playing a note. When this is combined with slight glissandos, it can be made to sound a lot like speech. In fact, there is a piece for solo trombone called "General Speech" which uses this technique to recreate General MacArthur's farewell speech at West Point on the trombone. You can read about it and hear it at http://artofthestates.org/cgi-bin/piece.pl?pid=11
To be fair, neither of these techniques produces anything you'd be able to understand in a conversation, but the point is that a sufficiently skilled trombonist could definitely say something that could be recognized as "I Owe You" on the instrument.
People who don't know the German ö may very well mistake it for an ineptly-said English ea, however. Just as many Americans interpret the Canadian-raising pronunciation of "about" as "aboot", even though it's a different sound entirely.
"there's a lot more to vowels than relative pitch."
I agree with everything else you said, but your own words seem to bring us to the conclusion that pitch has nothing at all to do with distinguishing vowels (in languages in which pitch alone cannot be used to distinguish words).
Bingo. I'm surprised nobody else has questioned whether the purported fact in the article is even true. I can't relate at all to not understanding the first few words someone says.
I've always noticed the awkward goodbye in other people, not so much in myself, mostly because I don't have many phone conversations.
But it seems people have to make an excuse every time they're hanging up, rather than just saying "I'm going to go now, bye" or something.
this is particularly true (in my limited experience) if you're not a native speaker - saying something like "hello i'd like to talk to extension 123" gives people a chance to figure out your accent. also, like smiling, saying something friendly encourages people to make an effort to understand you.
I wonder if attention plays a part. I have noticed that people can interpret what I say if I start with a grunt or other vocal sound, but if I start off with the thing I want to say immediately, I have to start over. I realize that it can be explained by this "human brain interprets your two syllables" idea, but if a mere grunt also does the trick, this suggests to me that attention plays at least as important a part.
To me too, it seems attention is of more importance here. It seems the author has a degree in linguistics (http://www.esmerel.com/wagons/ann/), so I won't dismiss this theory easily, but the lack of any citations or references in the submitted page looks suspicious.
Very interesting. I think phone skills are something a lot of people overlook.
I don't have the resources to do a study on this, but another tip is to smile when you talk on the phone. It might just be because my voice is deeper (making something like a smiling face lets you hit higher notes when singing too), but it seems like people understand me faster on the phone when I do this.
Yes. See: "VTLN". Also, the audio feature transformations are adapted during recognition in most cases. For instance, speaker independent recognizers generally do runtime adaptation on a per conversation basis. Speaker dependent ones generally continually adapt to the user using the same techniques.
The word "the" serves a similar purpose, to reset the vocal tract so the next word can be resolved more easily than if it was following an arbitrary word.
Sounds like a good practical explanation why in some cultures (French, for example), it's practically mandatory to say the equivalent of "Hello" and other pleasantries before getting to the actual content of the conversation.
I'd like to see this theory put to the test. Get a number of differently-sized individuals to record "see" or "saw" and test the ability of native speakers to identify which they heard.
I find it interesting that human voice communication seems to have a kind of built-in handshake to synchronize expectations of vowel sounds etc. It makes sense, but it hadn't occurred to me before now, and is probably part of the reason speech to text is so difficult.
I've known a few people who don't get the "conversation is ending" inflection and can't/don't express it. It leads to some awkward pauses on the phone and me having to ask continuously if there is anything else. Maybe the other person doesn't want to hang up first, but I have no trouble hanging up first if there's an indication that the conversation is over and the other person isn't just stalling because they have more to say.
In my experience, I know a couple of people who don't understand how starting and ending a conversation works. Usually a person's tone and speaking pace have a distinct pattern at the beginning of the conversation, and also a distinct pattern at the end that shows that the conversation is ending.
These people just start off talking like it is the middle of the conversation. When they are done they just walk right off abruptly. It is interesting the way we come to expect certain speaking patterns based on how most people communicate.