Definitely not perfect, but this is hobbyist-grade work. With the recent work on parallelized WaveNet synthesizing 10 seconds of audio for every second of wall-clock time, a live fake that fools 50% of regular people is probably a couple of years away at most, particularly if you can control the setting to ensure the lighting, angles, etc. match up reasonably well.
I am starting to wonder if this concept simply works better for some viewers than others. I always notice right away that the shape of the head is sort of wrong... I'm wondering if I use "shape of head and typical hairstyle" as a major way I recognize people. Maybe other people are really focussing on facial features like "lips, eyes, and nose"? Honestly, what I keep thinking when I see stuff like this is "this just looks like Paul Rudd wearing a lot of makeup", not "this looks like Jimmy Fallon with a narrower head and a different hairstyle".