Hacker News
Photo-realistic lip-sync from text (ritheshkumar.com)
154 points by giacaglia on Dec 9, 2017 | 52 comments



I'm enamored and terrified at the same time. As a documentary film editor this is going to literally save my ass for those times when production didn't quite get it right. As a citizen, hacker / child of the 80's who's watched a few successive generations come around, I feel it's going to be that much harder to have a good bullshit detector. I've watched in disbelief as my peers believe whatever happens to be printed or shown on screen, culminating in our current fake news catastrophe. I predict an even greater rift between the skeptical few and the duped masses. Maybe there's hope? Maybe when nothing can be believed anymore everyone will be forced to become skeptics. I doubt it though. It takes good mentors and willing minds to develop good bullshit detectors. Sadly, in my experience that's not something that comes naturally to most people.


> Maybe when nothing can be believed anymore everyone will be forced to become skeptics.

I don't think that's a good thing, though. Right now you can catch a politician in a lie, on tape. Once that's easily faked people will just see whatever confirms their preexisting bias.

It's something more than being a skeptic - when there's no definitive proof of anything, you just pick and choose what you believe.


There was an interesting take on this in Black Mirror: the idea was that sensor technology would grow quickly enough that you could produce high enough quality fakes to stay ahead.


Great point! Reminds me of The Ouroboros of Scientific Evidence.


I think you could achieve the same result (if not better, since fewer people are likely to sit through a video) with an image of a snappy fake quote next to someone.

People who want to believe "fake news" will believe it, people who don't care, won't care, and people who think it sounds unlikely or don't want to believe it will either disbelieve it or hopefully attempt to fact check it.

To take a more positive view of this kind of tech, I think it has promise in producing satirical content.


>As a documentary film editor this is going to literally save my ass for those times when production didn't quite get it right.

??? Documentary = non-fiction, isn't it? I mean, would you use CGI to show someone doing something you didn't quite record them doing?


This is awesome. My paranoid side is greatly concerned with the recent news that Trump has been denying the Access Hollywood tape to people. With something like this around, he could point to literally anything, say it’s fake, and a significant portion of the population would believe him.


If the last few years have taught us anything, it's that Trump supporters don't need any basis in reality whatsoever to maintain a belief in whatever Trump says. Trump could claim aliens from Mars are influencing the mainstream media and /r/the_donald would have a sticky up linking Martians with Clinton before Breitbart was finished typing the headline.


Not sure if you're intentionally referencing it, but NASA did have to deny that it was running a child sex slave ring on Mars during the election after an accusation by an Alex Jones guest.



I did not know that, and it's a sad sign of the times that at no point when reading what you wrote did I think anything more than "huh, figures".


"Before something like this was around, he would point to literally anything, say it’s fake, and a significant portion of the population would believe him."

Fixed that for you.


Wouldn't he be right about it?


Radiolab episode on the subject from earlier this year:

http://www.radiolab.org/story/breaking-news/


It's funny to me everyone says it looks really good. I actually feel like it's surprisingly bad, but it's indeed just a start I guess.


Pretty firmly planted inside the uncanny valley. I think it'll be a while until it's truly convincing, but it should be interesting to follow.


The video claims it's to help people who have lost the ability to speak. Can someone explain how this is supposed to work?

Because it looks more like, and is seemingly marketed as - if a video of Barack Obama saying things he hasn't said is typical - a way to make prank videos.

Edit: And why is it even called ObamaNet? Is it endorsed by the former president?

(I'm not making a political point here, I'd ask the same if it was TrumpNet or GagaNet. Is Obama notably connected with this kind of tech or research?)


I suspect that Obama is quickly becoming the Lenna [1] of synthesized speech. I seem to remember him being used in examples years ago, in addition to the Radiolab episode which uses him as a test case as well.

Not a fan of the ObamaNet naming either - really seems to miss the mark.

[1] https://en.wikipedia.org/wiki/Lenna


If you remember the film critic Roger Ebert, he had cancer in his jaw in 2002. They had to remove his jaw. He was unable to speak intelligibly after that until his death in 2013. He wrote articles right until the end.

If he had access to this, he could have released video reviews, which could have drawn a much larger audience.


There's a popular YouTube channel called barackdubs where the creator stitches clips from Obama's speeches into pop songs.



They haven't quite gotten out of the uncanny valley here. It sounds like Obama is slurring his speech at times. Still, very impressive, and more than a little scary. I'm sure that with a little more development they'll be able to knock off the last few rough edges, and then we really won't be able to tell truth from fiction in videos any more.


I have to wonder if today's instant and constant news cycle will end up being a brief anomaly, once the technology to literally create "fake news" is a bit more powerful and accessible. Will journalism require much more verification, or will the torrent of crap just become uncontrollable?


Sure, why not? The improvements in tools to efficiently produce and disseminate video are enjoyed by honest journalists too. Many of these official events have more than one photographer/videographer [0]. If only 1 outlet can produce a highly-suspicious video of a public event recorded/observed by dozens of other outlets, then distrust that single outlet. We have already had to do this in a medium that is easy to fabricate: text.

Society as a whole will just have to realize that video is as alterable as text.

[0] https://www.nytimes.com/2017/06/09/insider/a-photo-of-james-...


This is a fantastic analogy. Since you've pointed out how alterable text mediums are, I'm now thinking of services which provide a ranking of accuracy for text publications. Off the top of my head, I can think of two metrics: (1) The number of sources a publication cites, like a bibliography. (2) The number of publications which refer back to a publication, like PageRank.
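
A rough sketch of those two metrics in Python, assuming you already have a map from each publication to the publications it cites. The publication names and the `example_cites` data are made up for illustration, and this is plain textbook PageRank by power iteration, not how any existing rating service works.

```python
def source_counts(cites):
    """Metric (1): number of distinct sources each publication cites."""
    return {pub: len(set(targets)) for pub, targets in cites.items()}


def pagerank(cites, damping=0.85, iterations=50):
    """Metric (2): PageRank over the 'who cites whom' graph."""
    # Collect every publication, including ones that are only ever cited.
    pubs = set(cites)
    for targets in cites.values():
        pubs.update(targets)
    pubs = sorted(pubs)
    n = len(pubs)

    rank = {p: 1.0 / n for p in pubs}
    for _ in range(iterations):
        # Every publication gets a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pubs}
        for p in pubs:
            targets = sorted(set(cites.get(p, ())))
            if targets:
                # Split this publication's rank among the ones it cites.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A publication citing nothing spreads its rank evenly.
                share = damping * rank[p] / n
                for t in pubs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical data: each key cites the publications in its list.
example_cites = {
    "daily_a": ["wire_c", "blog_b"],
    "blog_b": ["wire_c"],
    "wire_c": [],
    "tabloid_d": ["wire_c"],
}

if __name__ == "__main__":
    print(source_counts(example_cites))
    print(pagerank(example_cites))
```

Neither number says anything about whether a claim is true; a well-cited outlet can still be wrong, and both metrics can be gamed by outlets citing each other.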


All it will take is people to start focusing on creating "fake news" for it to start to appear.

From 1997, 20 years ago: https://www.youtube.com/watch?v=q-ix5IYz0cc


Ask yourself which is more lucrative and you’ll have your (depressing) answer.


There is a saying in Russian that goes "Одно лечишь — другое калечишь" which translates roughly to "You heal one thing, but cripple another". I understand that they see this technology used by people who lost the ability to communicate, but in my opinion, it's far more destructive than it is helpful. Either way, the Obama video is leaps and bounds better than other demos I've seen just a few months ago. Pretty impressive progress.


This sort of thing really makes me feel like the field of "authentication" is about to become much broader and extremely important.


Does anyone else find it amusing it sounds like "liar bird"? I think their intent is very nice, but all I can see in the long run is this being abused by conspiracy theorists and people with an agenda to distort the truth.

People are gullible enough as it is (e.g. 9/11 conspiracists), and usually just want to believe what they already think is true. This will just fuel ignorance and is another reason why it's very important that we somehow as a society get people to think for themselves sensibly.

That said, the demo they showed is very impressive, very good work by them.



Looks pretty good. Their example video sounds terrible and robotic, seemingly by design.

Would the best approach to actually deceiving the viewer be a voice impressionist paired with this technique?


With the rate at which this technology is advancing, I see this as more of an MVP to a much more powerful tool in the not-too-distant future. But I also wouldn't hate being wrong on this one.


The best approach would be (only half-joking here) to wait a year and then use the voice synthesis neural nets from Baidu or a dozen other groups, Lyrebird included (generally either an RNN or a WaveNet-style CNN). The last WaveNet samples were indistinguishable from real for me.


The lip syncing tech looks great, and the personality that comes out in the voice would be a dream to have if I couldn't speak for myself.


Now it will get much easier to fabricate statements and much easier for someone to deny something they actually said (by claiming it was fabricated).

This is a Photoshop of voices and as such it is neither more nor less dangerous.

We desperately need better ways to detect BS built into browsers, social networks etc.


Turns out the masses have been getting duped for millennia; it's that trivial. So the bar is pretty low already.


That is absolutely excellent work - the very best of DL. Keep crushing it, Lyrebird team.


I'm surprised they have blatantly named it ObamaNet. Will it not make them vulnerable to defamation or impersonation or some other legal violation? Is this kind of use of a person's name, face and voice legally allowed in the US?


I think they named it exactly this to highlight the possibility of malicious use for defamation/impersonation attacks. Plus, if it was really used to, say, impersonate Obama, it'd be rightly referred to by name in the news as "ObamaNet", which to a layperson would sound like a smoking gun.

Lyrebird.ai similarly followed this approach as they realized their own technology could be used for nefarious purposes given the current political climate.


I just assumed it was because they trained it on copious publicly available speeches by Obama?


Isn't this the same video from http://futureoffakenews.com ?


I found ObamaNet to be much higher quality, to be honest.


These researchers are building weapons. I hope they understand that.


Lyrebird's website leaves me with no confidence whatsoever that they understand the ethical implications of the technology they're developing or that they are working to develop their technology in such a way as to address those concerns.

I hope every company that's working on this stuff either fails, financially or technologically, or ideally gets regulated out of existence, at least until they commit to working on technology, policy and journalistic techniques to mitigate the absolutely inevitable misuse, because as a society we simply are not ready for this. If you work for one of these groups, you should either ensure they're working on said mitigations, or quit.


Technology is always a double edged sword. It's up to all of us to ensure that the good outweighs the bad.


That's a bit harsh... There are always some advances in technology that society needs to adapt to. This is inevitable - and no amount of regulation will help it. It is also not something bad, that's just the way it is and always was.


Do you feel the same way about Photoshop? Society seems to have adapted to that just fine, despite the misuses.


Would you prefer only the Russian government had access to this technology? By making the technology publicly available we can all fully understand its power. It's mathematics, and mathematics cannot be suppressed no matter how much you want it to be.


This isn't new technology. Something similar (audio only) was developed 20 years ago in the US but it "disappeared" from public view.


Better the Russians than the Chinese though! Right?


It’s funny they find some convoluted justification, saying it’s to help people who can’t speak.



