Google Turns Your Android Phone Into An On-The-Fly Conversation Interpreter

ghshephard · on Jan 12, 2011

This is almost moving Kurzweil's prediction from "mostly correct" to "correct".

"Early 2000s

    * Translating telephones allow people to speak to 
      each other in different languages.
    * Machines designed to transcribe speech into 
      computer text allow deaf people to understand 
      spoken words."

Every time he notches another victory, I pay closer and closer attention to his other guesses that are 10 - 30 years out.

fergal_reid · on Jan 12, 2011

I don't buy that Kurzweil prediction accuracy stuff.

There's just so much wiggle room when you start allowing categorisations like 'mostly correct', and the sort of equivocation that seems to be going on. I thought his recent evaluation of predictions was weak.

First off, this particular technology could probably have been built, very badly, in the early 90s. It would have had accuracy too low to make it useful. It's cool that Google are doing this now, certainly, but the questions on accuracy and speed need to be answered. I'm sure they'll do a good job, but my point is that there is a big difference between when the technology is first prototyped in a lab, in some form, and when it's mature enough for widespread consumption. The predictions really need to say which they are referring to to be meaningful - a big problem with the recent evaluation.

Further, if your thesis is that technological capacity grows exponentially, like Moore's law, then being 8 to 10 years off is pretty wrong.

Finally, I'll just say that the charts of progress and development in the book 'The Singularity Is Near' are hard to take seriously. Subjective technological milestones are arbitrarily chosen, sometimes over millennial timespans. These are plotted log-log, lines are fitted and inferences drawn that make predictions for the next 30 years. What's the margin of error of a methodology like that?

I'm not disagreeing with any particular thesis; his writing certainly stirs healthy debate on important ideas, and it's certainly worth pointing out to a whole lot of people that technological growth is non-linear.

But beyond that, I'm not paying too much attention to the future predictions.

ghshephard · on Jan 12, 2011

I'm a huge Kurzweil critic as well, and, in fact, believe he is overly generous sometimes in his "Mostly Correct" versus "Correct" category.

But, seriously - you've got to give the guy credit for his "* Translating telephones allow people to speak to each other in different languages."

That's a pretty far-out prediction for when he made it, and, the new Droid Apps are doing precisely that.

The voice recognition stuff that Dragon does freaks me out so much that I actually don't use it - it weirds me out to see 5-10 minutes of me talking coming out with only one or two typos. It's an uncanny valley experience. I can only imagine what it's going to be like when _translation_ experiences the same increase in performance and accuracy.

BTW - the entire point of Kurzweil's writing, and "singularity" thesis is precisely that the last 500+ years of technological growth are all building on each other, and a 20% growth next year is incomparable to what 20% growth was 100 years ago, presuming we had year over year growth of 20% in terms of our information processing capability.

We can argue if it's 30%/year, 20%/year, or 10%/year - but the reality is that as long as it's 10% plus (which I think every resident of HN would agree is a conservative predictor for future technological growth for the next 20 years), the curve gets freaky very soon.
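As a rough illustration of how quickly those three rates diverge over the 20 years mentioned, here is a minimal sketch. The helper name `compound_growth` is made up for this example, and it only compounds a fixed yearly rate; it does not model any real measure of information processing capability.

```python
def compound_growth(rate: float, years: int) -> float:
    """Return the overall multiplier after `years` of compounding at `rate` per year."""
    return (1.0 + rate) ** years


# Compare the three yearly rates from the comment over a 20-year horizon.
for rate in (0.10, 0.20, 0.30):
    multiplier = compound_growth(rate, 20)
    print(f"{rate:.0%}/year for 20 years -> about {multiplier:.1f}x")
```

Under that simple assumption, 10% a year gives a bit under 7x in 20 years, 20% gives roughly 38x, and 30% gives roughly 190x, so the choice of rate matters a great deal even before arguing about whether the growth continues.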

Kurzweil isn't interesting in terms of the precision and absolute accuracy of his predictions - which I'll agree with you, he is overly generous - but in terms of the "class" of advances he is predicting.

Basically, anything that can be translated (pardon the pun. :-) into an "Information Processing Domain" - is going to be radically transformed in terms of performance, personalization, and miniaturization.

He predicted, when it was _not_ obvious - that we would have phones that would be able to translate our conversations, right around now.

So, I find his prediction about things like "nanotechnological blood and organ analysis in 20 years" to be pretty darn interesting.

InclinedPlane · on Jan 12, 2011

"That's a pretty far-out prediction for when he made it."

Did he make his prediction in the 1960s? The original Star Trek series posited the existence of computerized universal translators.

Edit: Text-to-speech and vice-versa technology has been commercially available for decades, reasonably good versions for at least 2 decades. Automated translation systems have been available for decades, with free, online services available for at least the last decade. Talking electronic multi-language dictionaries have been available for years. It doesn't take much creativity to imagine plumbing together existing technologies (speech-to-text in one language, textual automatic translation to another language, and text-to-speech in the second language). Streamlining the process, making it work well, and making it available easily on widely available cell phones is great and amazingly useful but hardly an unpredictable innovation.

Sorry, imagining that it would some day be possible to pipe the output of previously existing application A into previously existing application B (or imagining that previously existing devices will gain slightly incrementally improved input capabilities) is hardly a "far-out" prediction.
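The plumbing described above (speech-to-text, then text translation, then text-to-speech) can be sketched in a few lines. Everything here is a hypothetical stand-in: `interpret`, the three stage signatures, and the toy stages are invented for illustration and are not any real speech or translation API. The toy "audio" is just UTF-8 bytes and the toy translator is a word-by-word lookup.

```python
from typing import Callable, Dict

# Hypothetical stage signatures; none of these are real library calls.
SpeechToText = Callable[[bytes, str], str]   # (audio, language) -> text
Translate = Callable[[str, str, str], str]   # (text, source, target) -> text
TextToSpeech = Callable[[str, str], bytes]   # (text, language) -> audio


def interpret(audio: bytes, source: str, target: str,
              stt: SpeechToText, translate: Translate,
              tts: TextToSpeech) -> bytes:
    """Pipe the output of each stage into the next one."""
    text = stt(audio, source)
    translated = translate(text, source, target)
    return tts(translated, target)


# Toy stages so the sketch runs end to end.
def toy_stt(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


TOY_LEXICON: Dict[str, str] = {"my": "mis", "shoes": "zapatos"}


def toy_translate(text: str, source: str, target: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def toy_tts(text: str, language: str) -> bytes:
    return text.encode("utf-8")


result = interpret(b"my shoes", "en", "es", toy_stt, toy_translate, toy_tts)
print(result.decode("utf-8"))
```

The glue itself is trivial; the accuracy, noise, and accent problems debated elsewhere in the thread all live inside the three stages.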

ghshephard · on Jan 13, 2011

There's a difference between predicting which decade a technology will be in use by the consumer, and whether it is possible. Almost all technologies are possible (teleportation included). The question is when they will land in yours/my hands.

I'm more impressed with Kurzweil saying: "Early 2000s

    * Translating telephones allow people to speak to
      each other in different languages."

And being off, admittedly, by six or seven years, than someone positing:

"Teleportation is possible" and it becoming practical in 2170.

WalterBright · on Jan 13, 2011

Teleportation will never be possible.

ghshephard · on Jan 13, 2011

http://en.wikipedia.org/wiki/Quantum_teleportation

fergal_reid · on Jan 13, 2011

That right there is exactly the sort of reasoning I object to.

A poorly defined prediction is made: "Teleportation will be possible". Now, everyone who reads this probably imagines 'beam me up, scotty'.

Subsequently, teleportation is defined to include things like 'quantum teleportation', which, while it does have 'teleportation' in the name, does not instantly move mass, from one place to another, and, fundamentally, is not theorised to enable the Star Trek-esque people 'beaming' around the place.

I'm not saying which definition of the word 'teleportation' is best - just that the initial statement, if we allow it to be subsequently satisfied by either of those things, in a way it wasn't intended, is not very interesting.

fergal_reid · on Jan 12, 2011

"you've got to give the guy credit for his translating telephones..."

Was it really so hard to predict in 1990 we'd eventually have automatic translation?

First off, Kurzweil had a company doing (primitive) speech recognition since the 80s; he was a domain expert.

http://www.kurzweiltech.com/kai.html "I also started Kurzweil Applied Intelligence, Inc. in 1982 with the goal of creating a voice activated word processor."

Given that there was speech recognition tech around, and with simple, poor, machine translation since much earlier, was it really so hard to extrapolate that one day we'd have machine voice translation? Especially if your prediction didn't specify any benchmark accuracy?

What was hard to estimate was when it would happen.

"He predicted, when it was _not_ obvious - that we would have phones that would be able to translate our conversations, right around now."

But what is now? Assessing the accuracy of the prediction, there's a big difference between 'early 2000s', and 2011, especially on an exponential curve.

However, if you make a prediction without any sort of an objective or verifiable benchmark, the timing gets real easy, because it's not clear whether you meant

* that the technology could be implemented, or

* had been implemented in a lab, or

* was commonly available.

Nor is it clear, in this specific example, what levels of translation accuracy you are requiring for the prediction to be met.

This provides huge wiggle room. He must realise that, but continues not to provide precisely verifiable predictions.

It is, of course, very hard to assess the obviousness of a prediction, retrospectively, and this is something I'm conscious of - it's easy to say it's obvious, after the fact.

However, if he was serious about tracking accuracy, he would make clearly defined predictions, and in order to establish their value at the time, would make something like a real money prediction market - or a series of bets with other domain experts - which would enable a retrospective evaluation of how outspoken the prediction was.

As it stands, this hasn't been done in the past, so when one of his predicted technologies suddenly shows up, I remain sceptical about his overall accuracy.

I agree with you that the core thesis - that exponential growth has large consequences, which are as yet poorly understood - is a very interesting and important one. I am cautious though; inferring continued exponential growth, purely from looking at the past, is inductive reasoning; it's not something we should take for granted, either.

ghshephard · on Jan 13, 2011

Okay - for all the Kurzweil naysayers - how about going out on a limb, and telling me which of his predictions are ones that we'd "eventually have" (+/- let's say five years or so), and which ones are just plain wrong.

For example - I think a _lot_ of people in 1990 would have said (and in fact, they did say) that voice data entry would be the majority method for communication with computers in 2010. Totally and completely wrong. Kurzweil admits that he was way off on that one. (By at least 10 more years, he figures.) But, at the time - it really seemed to be much more likely that we would automate voice entry, rather than teaching the entire human species to touch type.

So - go for it - tell me, right now, which of his many predictions are "obviously going to happen." and which are "Totally foolish".

Alternatively, do you have some that he hasn't seen?

What drives me crazy about the Kurzweil Naysayers is that

  A) They nitpick - overlooking his general correctness.
     (I include myself in that category, btw.)

  B) They do it after the fact - I'd love for them to
     pick a few of his future predictions and say that
     (1) they will never happen or (2) they are obviously
     going to happen.

storm · on Jan 13, 2011

You're a "huge Kurzweil critic" one reply back, now you're railing against the sins of his "naysayers"?

ghshephard · on Jan 13, 2011

I simultaneously have issues with Kurzweil's tendency to be overly generous with his "Mostly Correct/Correct" categories (check out my comments in other threads) - yet, at the same time believe it's important to look beyond the nitpicking of minor details. In the _specific_, Kurzweil is overly generous - but I think it's important that we don't lose sight of the fact that the general arc of his predictions, are, in fact, pretty good.

Critic is not necessarily pejorative. A critic can point out the good and bad.

From: http://www.thefreedictionary.com/critic

Critic: One who forms and expresses judgments of the merits, faults, value, or truth of a matter.

fergal_reid · on Jan 13, 2011

"So - go for it - tell me, right now, which of his many predictions are "obviously going to happen." and which are "Totally foolish". "

I'm not claiming to be able to make long term predictions. I'm also not egomaniacal enough to think that anyone will care enough to come back and check my post in 5/10 years :) I'm just saying to accurately evaluate how good his correct predictions were, we would need a benchmark of predictions of other people.

I think there's a lot of value to reading Kurzweil's writing, and it's entertaining. I just object to people who pick out the things he generously got right, and massively overestimate his success rate, including him. He recently announced an 86% success rate. 1) He shouldn't put sharp numbers quantifying things that have been loosely defined, and sloppily evaluated. 2) He has nowhere near that, in my opinion.

I don't want to go through 2 in detail, but I think it's fairly clear from reading his prediction summary, or http://us.penguingroup.com/static/packages/us/kurzweil/excer... that while he got a good few things right, he got a lot wrong too.

billpaetzke · on Jan 12, 2011

Predictions made by Ray Kurzweil:

http://en.wikipedia.org/wiki/Predictions_made_by_Ray_Kurzwei...

BRadmin · on Jan 12, 2011

Screenings for a documentary about him, Transcendent Man, were finally posted yesterday:

http://transcendentman.com/

abecedarius · on Jan 12, 2011

How well does this system handle your second bullet point (transcription for conversation with the deaf)? I have a use for that but, not having an Android phone yet, I can't just try it out.

michaelbuckbee · on Jan 12, 2011

Dragon Dictate is available for both iPhone and Android which would seem sufficient for a face to face conversation (or at least better than the alternative).

Splines · on Jan 12, 2011

Probably just as well (as the deaf person wouldn't have to interpret the text-to-speech engine).

I'd wager that this project was built on the back of GOOG-411.

nazgulnarsil · on Jan 12, 2011

I was forced to revise my estimate in Kurzweil's direction of 2040 for AI/whole brain emulation after seeing AIXI-MC, and the memristor brain project being funded by DARPA. My original estimate was more like 2080.

pyre · on Jan 12, 2011

Interestingly enough a search for 'AIXI-MC' turns up an OKCupid posting as the third result. And it's correct:

  possible instantiations of AIXI-MC or Goedel machines

is listed under 'I spend a lot of time thinking about.' Though I wonder if that's a failure of Google or a failure of the popularity of the term 'AIXI-MC.'

michaelbuckbee · on Jan 12, 2011

What's most interesting to me is that Google seems to be moving forward with a strategy of competing with iOS via services instead of just applications.

Services (like Conversation Mode translate, constantly updated turn by turn GPS driving directions, more deeply integrated Google Talks) give them leverage against the carriers, who have to meet Google guidelines for Android in order to include the flagship apps and represent a really high bar that Apple would have to overcome to compete.

I realize it isn't exactly as cut and dried as services vs applications, but it is certainly a strong move that plays to Google's strengths.

nostrademons · on Jan 13, 2011

I suspect that mobile in general will move towards services instead of applications. There's a lot of data crunching that you can do when you have a data collection device in every person's pocket, but doing that crunching on the client will drain your battery in no time. I predict that the really interesting mobile apps will have a thin (but native) client that only does data collection & UI, and then the most computationally interesting piece will be hosted in the cloud somewhere.
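A minimal sketch of that split, under loud assumptions: `SensorSample`, `build_upload`, and `cloud_process` are hypothetical names, the "cloud" side is just a local function standing in for a hosted service, and the averaging is a placeholder for whatever heavy computation would actually run there.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class SensorSample:
    """One raw reading collected on the phone (fields are illustrative)."""
    timestamp: float
    audio_level: float


def build_upload(samples: List[SensorSample]) -> str:
    """Thin client: package raw readings as JSON, do no analysis on the device."""
    return json.dumps({"samples": [asdict(sample) for sample in samples]})


def cloud_process(payload: str) -> Dict[str, float]:
    """Stand-in for the hosted service where the crunching would happen."""
    samples = json.loads(payload)["samples"]
    levels = [sample["audio_level"] for sample in samples]
    return {"count": len(levels), "mean_level": sum(levels) / len(levels)}


payload = build_upload([SensorSample(0.0, 0.2), SensorSample(1.0, 0.4)])
summary = cloud_process(payload)
print(summary["count"], round(summary["mean_level"], 2))
```

The client only serializes and sends; anything expensive stays on the server side, which is the battery argument made above.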

Andrenid · on Jan 12, 2011

In the last 6-12 months alone I'm really starting to see "the future" that I dreamed of as a kid.

Between this article, http://questvisual.com/, Microsoft Kinect, Nintendo 3DS, Microsoft Surface 2 (first version didn't impress me, was so huge and clunky), Amazon's Kindle 3 (huge fan of HHGTTG, the Kindle basically IS the guide) etc...

As a life-long nerd/geek, it's pretty awe-inspiring.

enjo · on Jan 12, 2011

I'll second this.

The Kinect, in particular, is my invention of the year. I've had it for two months now and the freaking menu in dance central STILL doesn't get old. Just moving through that menu is the single most tactile thing I've ever done when interacting with a machine.

I LOVE it... and I can't wait to see how that technology grows and changes in the coming years.

othello · on Jan 12, 2011

I would also add to that list the Emotiv kit (http://www.emotiv.com), which allows "to create applications that can be controlled by your mind" by making use of the same type of technology that allows paralytics to control robotic arms.

Actually, the technology alone is highly impressive, but that an SDK is available to anyone for $299 is nothing short of mind-boggling.

MichaelApproved · on Jan 13, 2011

$299 for the device + $500 for the sdk.

roschler · on Jan 21, 2011

No. The $500 Developer edition includes both the SDK and the device (headset).

Estragon · on Jan 12, 2011

Actually, a smart phone with offline wikipedia is a much closer approximation to The Guide than the kindle.

Andrenid · on Jan 12, 2011

True, but the fact alone that it can store lots and lots of information, has the ability to obtain new information wirelessly, is "flat, book shaped, with a large screen and lots of buttons underneath the screen", etc... is what sells me on the comparison.

Being able to store Wikipedia offline would definitely clinch the deal though, even if it just stored entries on geographical locations and places, ignoring everything else... or if it had a Wiki app that let you choose topics and "cache" them locally. That would be awesome.

anigbrowl · on Jan 12, 2011

What age are you, out of curiosity?

Andrenid · on Jan 12, 2011

kenjackson · on Jan 12, 2011

Take this and Word Lens like capability -- if I was in junior high, I'd make the argument that there's no need for me to learn a foreign language. In a few years, I can speak and write any language there is!

alextp · on Jan 12, 2011

This might save you some time/money, but it certainly can't bridge the gap between different languages. Keep in mind that "to translate is to betray" http://en.wikipedia.org/wiki/Traduttore,_traditore

throw_away · on Jan 13, 2011

Doug Hofstadter of GEB fame wrote an interesting book on this called Le Ton Beau de Marot (http://en.wikipedia.org/wiki/Le_Ton_Beau_de_Marot) where he argued that 100% translation was pretty much impossible, and machine translation in particular. In that book he translated one poem from French to English thirty-six times, each time capturing some nuance of the original, but showing that no single translation could possibly capture everything.

ximeng · on Jan 12, 2011

Imagine having a real-time interpreted conversation like they do in this Microsoft video:

http://zekeweeks.com/2010/03/03/real-life-babelfish-the-tran...

The delay and imperfections would mean you communicate at less than half the speed you would if you spoke the language natively. That's going to get annoying quickly.

kenjackson · on Jan 12, 2011

But after four years of high school Spanish, my communication is probably even worse.

Plus I can take that four years studying Spanish and instead work on improving the translation. :-)

But if I ever leave my shoes in the library, I am ready! "Mi zapatos son en la biblioteca"

lukeschlather · on Jan 12, 2011

Almost ready. You're looking for "Mis zapatos están en la biblioteca." Or maybe Mi zapatos is fine, the z might negate the s in mis.

kenjackson · on Jan 12, 2011

Dude, it was just four years of Spanish... I can't be expected to speak as well as a toddler!

lemming · on Jan 12, 2011

No, you want "mis zapatos".

lukev · on Jan 12, 2011

Automated translation can get 90% of the way there, and we're very close to this point already. But for that final 10%, you basically need full AI that's capable of actually comprehending the subject matter in order to provide an adequate translation.

I think that's probably more than a couple decades out.

jokermatt999 · on Jan 12, 2011

There are still nuances and subtleties that will be lost in machine translation (I'd assume it would still take strong AI to do otherwise), so there is still plenty of reason to fully learn a language. However, learning a language at the junior high/high school level of two years or so of minimal study probably can be replaced by machine translation in terms of being able to communicate basic thoughts.

ghshephard · on Jan 12, 2011

Agreed. It will take a while to have "offline" translators available, but, if we can extend Moore's Law and increasing storage densities - then I'll conservatively predict that in 20 years, we'll have a pretty close to flawless babel fish like technology available, and that's just so we have five years to move the "online translation" technology to a local device. There are a number of research projects that are starting to make progress in this area.

2015 We will have universal dictionary lookup capability for most languages, and will have word-word translation in spoken, clearly written (handwriting AI is a long ways off - I'll make no predictions there), read, and listened to language. We will also have the capability to convert between all four (Write down a word in English, have it displayed and spoken out in French)

2020 will be the year that we'll start to see reasonably good translation systems that take into account some amount of nuance beyond word-for-word translation. This will be in a research environment, but will quickly move out of that environment into commercial applications.

2025 will be the year that Translation, as a skill set, will start to be replaced by machines - in particular, subtitling for movies into local languages will be predominantly done by machine system for all but the highest end productions.

2030 will be the year that we can write, read, speak, and understand, anyone in the world in each of our native languages, anywhere, any time. It will also be the year that Language Translation systems will be seen as a reasonable alternative to human translators.

So, as long as you plan on being a translator before 2030 - you should be okay - after that, all bets are off as to whether you have a career.

kenjackson · on Jan 12, 2011

The other thing that is interesting, when speaking about nuance and such, is that a good translator that uses tweets, emails, and other translated discussions can give me instant translation of emerging memes and subtleties in a language.

cageface · on Jan 13, 2011

This just brings you into the uncanny valley of NLP. To say that this is almost as good as native fluency is like saying that current CG animation is almost photo-realistic. Like so many things in technology, closing the gap of the last percent turns out to be as hard as the first 99 put together.

The need for phrasebook-level fluency in another language may be diminished soon by technologies like these though.

loewenskind · on Jan 13, 2011

Language is more than just words. It's culture, a way of thinking, a new point of view. If you're not interested in those things then you almost certainly never needed to learn a foreign language (English will get you by in most places and if you don't care about culture, why worry about the places it won't?).

aw3c2 · on Jan 12, 2011

Learning a foreign language is fun! Hell, I enjoy using English after all those years I was forced to learn it.

spiffworks · on Jan 12, 2011

Just tried it, works only for the English-Spanish pair right now, but the reliability is pretty good considering that they're calling it an Alpha.

gms · on Jan 12, 2011

The real question is how far it will move from alpha, if at all.

pfarrell · on Jan 12, 2011

My hovercraft is full of eels. Sorry, couldn't resist. Natural language processing, could we really see major improvements in our lifetimes? My gut tells me there are so many nuances to the way we work that the best we can get to will have to include many of the inconsistencies we have in understanding each other.

erikstarck · on Jan 12, 2011

The vodka is strong but the meat is rotten.

fhars · on Jan 13, 2011

Except that if you actually travel in parts of the world where you would need this service, international data roaming charges for accessing this service will be so outrageous that it may just be cheaper to hire a professional interpreter to travel with you.

johnyzee · on Jan 13, 2011

Has anyone else been waiting for this 'invention' forever? Between speech recognition and machine translation it seemed to be a solved problem for the longest time.

trurl123 · on Jan 13, 2011

First, Google should improve http://translate.google.com to show valid translations.

andrewljohnson · on Jan 12, 2011

Douglas Adams was mostly right. It's just not a fish.

tocomment · on Jan 13, 2011

Anyone know when this will be available to install? I'm itching to try it out.

pedanticfreak · on Jan 12, 2011

I have nothing meaningful to contribute, but I want to say this is a monumental step forward.

Some people here seem to predict this heralds the end of language education for most people. Maybe that's true. But I actually think this will contribute to earlier and more frequent exposure to foreign languages. And that may ultimately do more for language learning than compulsory education ever did.

JunkDNA · on Jan 12, 2011

"Google is quick to note that this is very much an alpha feature. In other words, expect a lot of hiccups. They note that background noise, thick accents, and quick speech can all trip up the app."

Wow, so it's not practically useful very frequently?

minalecs · on Jan 12, 2011

Dude... Google is getting closer to a real-time universal speech translator and you find a way to be unimpressed. What does impress you?

buro9 · on Jan 12, 2011

Hate to repost it but:

Everything's Amazing & Nobody's Happy

http://www.youtube.com/watch?v=8r1CZTLk-Gk

JunkDNA · on Jan 13, 2011

I didn't intend for that comment to sound quite as snarky as it did. But my point is that real-time speech translation needs to work precisely in those situations. Voice recognition and computerized translation have been around for a while. They are nearly always stymied by the problems mentioned in the article; we have been "close" to this for at least a decade. What would impress me is if they had developed new algorithms to overcome these exceptionally challenging problems.

jokermatt999 · on Jan 13, 2011

"Voice recognition and computerized translation have been around for a while."

Now your reaction makes sense. If you're viewing it as an advance in the technology here (research wise, etc), then this announcement has nothing new. But from a standpoint of anyone with an Android phone having access to this whenever they want (accessibility and ease of use), this is incredible. I think most people are concerned with the latter, but you do have a point about there being nothing "new" here.

kindly · on Jan 12, 2011

My guess would be a babel fish. It only picks up on subconscious frequencies, so outside noise will not get in the way. It also proves the non-existence of god.

brown9-2 · on Jan 12, 2011

Who would have thought that alpha software is in fact very alpha-like in its qualities?

felipe · on Jan 13, 2011

The "background noise, thick accents, and quick speech" is actually the most difficult part.

Google did not really do anything new. Speech recognition has been around for a while, as have translation engines. Although putting the two technologies on the cell phone makes an impressive demo and good PR, it does not actually solve the problem at a practical level.