If you have a background picture, you have all the info you need to identify your subject - just plain subtraction. I think this is what the Photo Booth app on my circa-2012 MacBook does, quite effectively.
This is a question we’ve gotten quite a bit (second author here).
A good intuition is that if this were already easy to do with any background, professional studios wouldn’t be spending so much money on green screens. Background subtraction is pretty poor in general without very constrained setups. Our goal is really to provide professional quality without any of the equipment.
And can your solution do what studios want, namely process a 4K video artifact-free when played back on a cinema screen? It doesn't look like that tbh if I watch the second video ("Ours real" is your work?).
And yeah, it requires a constrained setup and a lot of additional work, because even before you "subtract" the background you have to think about lighting. Your demo video might have very nice background matting, but the lighting is off, so it's relatively useless except for toy applications (of which there are a lot).
Also: did you compare somewhere with the very basic fixed-exposure method? Because for fixed exposure, background and camera placement I suppose this should work just as well... Still, I think this is a really cool project. I didn't get disappointed like with the last link of that sort, where someone tried the same with horrible artifacting.
Green screens are crap with hair: because it's translucent, the green/blue bleeds through, which means it has to be cleaned up by hand.
Then there are the situations where there isn't a green screen. Again manual cleanup is required. Each frame needs to be cut out by hand. 24 times a second.
The same with a difference matte. Cameras are noisy, so there is constant noise in the alpha channel. This makes the effect look wobbly and cheap.
What this method does is pull a key from a difference matte and make it look good.
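To illustrate the wobble that camera noise causes in a naive difference matte, here is a toy sketch for a single grayscale pixel. It is not how the paper's method works; the hard threshold, the Gaussian noise model and all function names are assumptions made for illustration.

```python
import random

def hard_key(value, ref, threshold):
    """Binary alpha for one grayscale pixel: 1 if it differs enough from the reference."""
    return 1 if abs(value - ref) > threshold else 0

def alpha_over_time(samples, ref, threshold):
    """Alpha of the same pixel across successive frames."""
    return [hard_key(v, ref, threshold) for v in samples]

def noisy_samples(true_value, sigma, n, seed=0):
    """Simulate sensor noise as zero-mean Gaussian (an assumption, not a camera model)."""
    rng = random.Random(seed)
    return [true_value + rng.gauss(0.0, sigma) for _ in range(n)]

def flips(alphas):
    """Count frame-to-frame changes in alpha; nonzero means visible flicker."""
    return sum(a != b for a, b in zip(alphas, alphas[1:]))

# A pixel whose true difference sits near the threshold (think a wisp of hair):
# hand-picked noise of a few grey levels is enough to toggle it every frame.
samples = [109, 114, 111, 116, 112]
alphas = alpha_over_time(samples, ref=100, threshold=12)
print(alphas, flips(alphas))  # [0, 1, 0, 1, 0] 4
```

Pixels far from the threshold stay stable; it is the borderline ones (hair, soft edges) that flicker, which is exactly where a key needs to look good.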
The project page has a video comparison against previous state of the art. You can't just subtract the background if it's not 100% static and stable. Further, the novelty seems to be fewer artifacts, especially around hair and eyeglasses.
> If you have a background picture, you have all the info you need to identify your subject - just plain subtraction
It's not really "just plain subtraction", it's keying. Which AIUI basically means setting the alpha according to the difference between the image and the reference.
Green screen works well for this because, excepting Zoe Saldana, people tend to hang out around the opposite side of the colour wheel, so there tends to be a good distance between foreground colour and background colour. If you're trying to do this against arbitrary backgrounds, you seemingly need to augment keying with additional techniques like image segmentation to get good results.
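A minimal sketch of keying in the sense described above, i.e. setting alpha from the colour distance between each pixel and the reference. The `lo`/`hi` thresholds and the linear ramp are hypothetical choices for illustration, not taken from any particular keyer or from the paper.

```python
import math

def difference_key(frame, background, lo=10.0, hi=40.0):
    """Per-pixel alpha from colour distance to a reference background.

    frame, background: equally sized lists of rows of (r, g, b) tuples, 0-255.
    lo, hi: hypothetical thresholds. Distance <= lo gives alpha 0,
    distance >= hi gives alpha 1, with a linear ramp in between.
    """
    alpha = []
    for frame_row, bg_row in zip(frame, background):
        row = []
        for pixel, ref in zip(frame_row, bg_row):
            dist = math.dist(pixel, ref)  # Euclidean distance in RGB
            a = (dist - lo) / (hi - lo)
            row.append(min(1.0, max(0.0, a)))  # clamp to [0, 1]
        alpha.append(row)
    return alpha

background = [[(100, 100, 100), (100, 100, 100), (100, 100, 100)]]
frame = [[(100, 100, 100), (125, 100, 100), (200, 50, 50)]]
print(difference_key(frame, background))  # [[0.0, 0.5, 1.0]]
```

With a green screen the foreground/background distance is large almost everywhere, so nearly any thresholds work. Against an arbitrary background, a grey jumper in front of a grey wall lands below `lo` and gets keyed out, which is why extra cues such as segmentation seem to be needed.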
This new method works well for partially transparent regions (hair) and allows slightly larger background movement and color overlap between foreground and background.
I think they baited you with their "look, movement" replacement videos. As far as I can tell, their inputs have a fixed background and are of constant exposure and camera position.
No, the camera is indeed allowed to change a tiny bit. For example, you do not need a tripod. Taking photos with a handheld camera works fine (although a tripod works even better). They explain it in greater detail in their paper: https://arxiv.org/pdf/2004.00626.pdf
Background subtraction methods on the other hand usually fail if the camera moves even a tiny bit or the lighting changes slightly. More advanced methods can recover eventually, but you still get a few frames with improperly removed background.
In the first example (the one with the girl), you can see that there are small camera movements. You can also see the effect this has when applying straightforward background subtraction in the second video.
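A toy illustration of why straightforward subtraction breaks under small camera movement: one row of grayscale pixels containing a background edge, shifted by a single pixel, with no subject in the scene at all. The helper names and pixel values are made up for the example.

```python
def abs_diff(row, ref_row):
    """Per-pixel absolute difference between a frame row and the reference row."""
    return [abs(a - b) for a, b in zip(row, ref_row)]

def shift_right(row, n):
    """Simulate a small camera pan: shift the row n pixels, repeating the edge value."""
    return [row[0]] * n + row[:len(row) - n]

# Reference background: dark wall on the left, bright window on the right.
background = [20, 20, 20, 200, 200, 200]

# Same empty scene, camera moved by one pixel.
frame = shift_right(background, 1)

print(abs_diff(frame, background))  # [0, 0, 0, 180, 0, 0]
```

The large difference at the edge would be keyed as foreground even though nothing is there, so every high-contrast background edge turns into a ghost outline as soon as the camera is handheld.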