There are good reasons for saying hello. (esmerel.com)
214 points by indiejade on Feb 22, 2010 | 37 comments



Wait, people would just call somewhere and say the extension? That sounds awfully rude to me. Wouldn't you normally say, "Hi, can I have extension 432, please?"

It does the same thing the article says, but you sound like a decent human being.


If you got a different response, like "Hi, XYZ Corp, what extension would you like?", it might be politeish.

Of course anyone being quite that manual has probably now been replaced by a PBX.


What it reminded me of is when, after waiting on hold forever, the CSR says something like: "Thanks for calling ABC Company. What's your account number?" or "What is your phone number?" Having the first "words" a customer utters be numbers is probably not the best idea. I wonder how many hours of waiting on hold could be eliminated if more companies understood this phenomenon.

I also wonder whether the full word "hello" would be necessary, since the word "hi" makes a similar cavity-sound to the spoken number "five"?


With something like Twilio I don't think you have any excuse for this part of the interaction: play a recorded greeting saying "Hiya, for fastest service, type in your account number. If you don't know your account number, hit 0 and one of our operators can help you look it up." If you get their account number, pick the appropriate CSR, bring up the details directly on their screen, forward the call and drop them straight into "Thanks for calling FooCorp. This is Melinda. To whom am I speaking, please?" She hears the customer's name, a necessary formality because many accounts have multiple people in charge of them -- the most common situation in B2C is husband and wife. "Thanks for calling, Mr. Smith. What can I help you with?" If Mr. Smith says his last order didn't ship yet, Melinda is a single keystroke away from pulling it up.

Ooh, the possibilities with Twilio make me giddy. If I can only productize one of them... (Crikey, the possibilities for per-company customization are endless. Do customers ever call you? You should be using Twilio.)
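For what it's worth, the routing decision in that call flow is tiny once the digits are in hand. A rough sketch, assuming the telephony layer has already collected what the caller typed; it deliberately does not use Twilio's API, and `ACCOUNTS`, `Route`, and `route_call` are hypothetical names made up for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical account table; a real system would query a customer database.
ACCOUNTS = {
    "1001": {"holder": "Smith", "assigned_csr": "melinda"},
}


@dataclass
class Route:
    destination: str            # which CSR or queue receives the call
    screen_pop: Optional[dict]  # account details to show the CSR, if known


def route_call(digits: str) -> Route:
    """Decide where a call goes from the digits typed at the greeting."""
    account = ACCOUNTS.get(digits)
    if digits == "0" or account is None:
        # Caller asked for help, or the number is unknown: send to an operator.
        return Route(destination="operator", screen_pop=None)
    # Known account: forward to its CSR and pre-load the account details.
    return Route(destination=account["assigned_csr"], screen_pop=account)
```

The CSR still asks for the caller's name, since the account number alone doesn't say which account holder is on the line.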


How hard would it be to manufacture a little device that plugged into the phone line (or just listened with a built-in microphone) and displayed the digits/dial-tones coming over the line on a seven-segment display? Then the CSR could just ask you to dial your information in, then read it and use it as normal.
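The decoding half of such a device is not much code. Each key on a 12-key phone pad sends one "row" tone plus one "column" tone, so it is enough to measure the energy at seven known frequencies and pick the strongest of each group. A minimal sketch using the Goertzel recurrence, assuming 8 kHz mono samples already captured as a list of floats; the function names are my own, and a real device would also need to detect silence between key presses:

```python
import math

# Standard DTMF row and column frequencies (Hz) for the 12-key pad.
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def goertzel_power(samples, sample_rate, freq):
    """Return the signal power at `freq` using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def decode_key(samples, sample_rate=8000):
    """Pick the strongest row tone and column tone and map them to a key."""
    row = max(range(len(LOW_FREQS)),
              key=lambda i: goertzel_power(samples, sample_rate, LOW_FREQS[i]))
    col = max(range(len(HIGH_FREQS)),
              key=lambda i: goertzel_power(samples, sample_rate, HIGH_FREQS[i]))
    return KEYPAD[row][col]


def make_tone(key, sample_rate=8000, duration=0.05):
    """Synthesize the two-tone signal for one key (only used for testing)."""
    for r, keys in enumerate(KEYPAD):
        if key in keys:
            low, high = LOW_FREQS[r], HIGH_FREQS[keys.index(key)]
    n = int(sample_rate * duration)
    return [math.sin(2 * math.pi * low * t / sample_rate)
            + math.sin(2 * math.pi * high * t / sample_rate)
            for t in range(n)]
```

Feeding it `make_tone("5")` should give back `"5"`; the output of `decode_key` is what the seven-segment display would show.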


I've never encountered an operator who takes extension numbers rather than names, but I'd say, for example: "Mark Thompson, please" without thinking it's rude (as long as I didn't know the operator). You gotta "make the nice" somewhere though :-)


I fully support being polite, but there's a lot more to vowels than relative pitch. When speaking a vowel, the geometry of your mouth creates multiple resonant frequencies. The ratio among those frequencies is far more important than the ratio of the dominant frequency of one vowel to the dominant frequency of the next.

Consider a couple of examples:

1. A trombone can't say "I Owe You", despite the fact that it can produce any dominant frequency, and that phrase contains only vowels.

2. Voice "Ahhhhhhhhhhhh", going from a high to a low pitch (I imagine the Doppler effect of a cartoon screaming as he rides by on a cart), and then voice "Ohhhhhhhhhhhhhh" in the same fashion. At every point, they sound different. You can't take a clip from an arbitrary pitch range of an Ah and convince someone it's an Oh.

3. Shape your lips like you would for oo as in Moon, Noon, or Spoon, and try to make the ea vowel as in Squeak just by changing your pitch. It won't happen. You'll get the German ö, but not the English ea.


Actually, #1 isn't quite true. A trombone can be made to sound like speech in a couple of ways.

First, with a plunger mute. The adults in the Peanuts cartoons were "voiced" by a trombone with a plunger mute, which gives the impression of a voice without necessarily sounding like it's saying anything in particular. (http://en.wikipedia.org/wiki/Peanuts#Television_and_film_pro...)

Second, it's actually possible to make it sound slightly like specific words by altering the shape and size of your vocal cavity while playing a note. When this is combined with slight glissandos, it can be made to sound a lot like speech. In fact, there is a piece for solo trombone called "General Speech" which uses this technique to recreate General MacArthur's farewell speech at West Point on the trombone. You can read about it and hear it at http://artofthestates.org/cgi-bin/piece.pl?pid=11

To be fair, neither of these techniques produces anything you'd be able to understand in a conversation, but the point is that a sufficiently skilled trombonist could definitely say something that could be recognized as "I Owe You" on the instrument.


People who don't know the German ö may very well mistake it for an ineptly-said English ea, however. Just as many Americans interpret the Canadian-raising pronunciation of "about" as "aboot", even though it's a different sound entirely.


"there's a lot more to vowels than relative pitch."

I agree with everything else you said, but your own words seem to bring us to the conclusion that pitch has nothing at all to do with distinguishing vowels (in languages in which pitch alone cannot be used to distinguish words).


"It won't happen. You'll get the German ö, but not the English ea."

Cool! Now I know how to pronounce a German ö!


And also the Turkish ö. But not `coöperate' from the New Yorker.


Are you sure this isn't a troll? It's so easily disprovable.

Here's a sound clip of an adult male speaking, followed by a young boy.

Do you honestly have trouble understanding the number the boy says?

http://soi.kd6.us/wp-content/uploads/2010/02/sound.mp3


Bingo. I'm surprised nobody else has questioned whether the purported fact in the article is even true. I can't relate at all to not understanding the first few words someone says.

(Although I doubt it's an intentional troll.)


I did, for what it's worth. I was able to make out "five" only by replaying the sound a couple times in my head.


On a related note, the Oatmeal just posted 10 reasons to avoid talking on the phone:

http://theoatmeal.com/comics/phone


I've always noticed the awkward goodbye in other people, not so much in myself, because I don't really have many phone conversations. But it seems people have to make an excuse every time they're hanging up, rather than just saying "I'm going to go now, bye" or something.


This is particularly true (in my limited experience) if you're not a native speaker - saying something like "hello, I'd like to talk to extension 123" gives people a chance to figure out your accent. Also, like smiling, saying something friendly encourages people to make an effort to understand you.


I wonder if attention plays a part. I have noticed that people can interpret what I say if I start with a grunt or other vocal sound, but if I start off with the thing I want to say immediately, I have to start over. I realize that it can be explained by this "human brain interprets your two syllables" idea, but if a mere grunt also does the trick, this suggests to me that attention plays at least as important a part.


To me too, it seems attention is of more importance here. It seems the author has a degree in linguistics (http://www.esmerel.com/wagons/ann/), so I won't dismiss this theory easily, but the lack of any citations or references in the submitted page looks suspicious.


Very interesting. I think phone skills are something a lot of people overlook.

I don't have the resources to do a study on this, but another tip is to smile when you talk on the phone. It might just be because my voice is deeper (making something like a smiling face lets you hit higher notes when singing too), but it seems like people understand me faster on the phone when I do this.


Does any speech recognition software use vowel normalization?


Yes. See: "VTLN". Also, the audio feature transformations are adapted during recognition in most cases. For instance, speaker independent recognizers generally do runtime adaptation on a per conversation basis. Speaker dependent ones generally continually adapt to the user using the same techniques.
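VTLN stands for vocal tract length normalization: the frequency axis is stretched or squeezed by a per-speaker factor before features are computed, so that a child's vowels and an adult's vowels land in roughly the same place. A minimal sketch of one common form, a piecewise-linear warp; the function name and the 0.85 cutoff ratio are illustrative assumptions rather than values from any particular recognizer, and the search for the best `alpha` per speaker is left out:

```python
def vtln_warp(freq, alpha, nyquist=8000.0, cutoff_ratio=0.85):
    """Warp `freq` (Hz) by the speaker factor `alpha`.

    Below the cutoff the frequency is simply scaled by alpha. Above it,
    a straight line is used so that `nyquist` still maps to `nyquist`
    and the warped axis stays within the original band.
    """
    # Shrink the cutoff when alpha > 1 so alpha * cutoff stays below nyquist.
    cutoff = cutoff_ratio * nyquist * min(1.0, 1.0 / alpha)
    if freq <= cutoff:
        return alpha * freq
    # Line from (cutoff, alpha * cutoff) to (nyquist, nyquist).
    slope = (nyquist - alpha * cutoff) / (nyquist - cutoff)
    return alpha * cutoff + slope * (freq - cutoff)
```

In practice the factor is usually searched over a narrow range around 1.0, picking whichever value makes the speaker's audio best match the acoustic model.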


Are more or fewer syllables needed in various languages? Does it matter whether they are tonal languages?


The word "the" serves a similar purpose, to reset the vocal tract so the next word can be resolved more easily than if it was following an arbitrary word.


[citation needed]



Sounds like a good practical explanation why in some cultures (French, for example), it's practically mandatory to say the equivalent of "Hello" and other pleasantries before getting to the actual content of the conversation.


I'd like to see this theory put to the test. Get a number of differently sized individuals to record "see" or "saw" and test the ability of native speakers to identify which they heard.


I upvoted this because I would love to see more linguistics-related posts come through HN.


I like this little esmerel.com site ... it's quaint and interesting. Thanks.


Never mind the fact that saying 'hello' is considered polite... But I still like this article :)


I honestly can't believe people write posts about this.


I find it interesting that human voice communication seems to have a kind of built-in handshake to synchronize expectations of vowel sounds etc. It makes sense, but it hadn't occurred to me before now, and is probably part of the reason speech to text is so difficult.


I've known a few people who don't get the "conversation is ending" inflection and can't/don't express it. It leads to some awkward pauses on the phone and me having to ask continuously if there is anything else. Maybe the other person doesn't want to hang up first, but I have no trouble hanging up first if there's an indication that the conversation is over and the other person isn't just stalling because they have more to say.


I know a couple of people who don't understand how starting and ending a conversation works. Usually a person's tone and speaking pace have a distinct pattern at the beginning of the conversation, and also a distinct pattern at the end that shows that the conversation is ending.

These people just start off talking like it is the middle of the conversation. When they are done they just walk right off abruptly. It is interesting the way we come to expect certain speaking patterns based on how most people communicate.


> It is interesting the way we come to expect certain speaking patterns based on how most people communicate.

Isn't this what language (and other practical conventions) is all about?



