MODNet: Is a Green Screen Necessary for Real-Time Human Matting? (github.com/zhkkke)
90 points by homarp on Nov 28, 2020 | hide | past | favorite | 42 comments


History/Context:

How Green Screen Worked Before Computers

https://www.youtube.com/watch?v=msPCQgRPPjI

Hollywood's History of Faking It | The Evolution of Greenscreen Compositing

https://www.youtube.com/watch?v=H8aoUXjSfsI

Zoran Perisic's 'Zoptic' front-projection system used in Superman: The Movie

https://www.fxphd.com/fxblog/effects-of-days-past-making-sup...

https://www.youtube.com/watch?v=mbXC16p8tNc

The technology that’s replacing the green screen

https://www.youtube.com/watch?v=8yNkBic7GfI

Mandalorian 'Stagecraft' technology

https://www.slashfilm.com/the-mandalorian-stagecraft-photos/


Some more context: this research was done by SenseTime Labs and CityU of Hong Kong.

HN discussion on SenseTime: https://news.ycombinator.com/item?id=17196704 see also https://qz.com/1248493/sensetime-the-billion-dollar-alibaba-... and MIT partnership - https://news.mit.edu/2018/mit-sensetime-announce-effort-adva...

They were also put on the US trade blacklist last year - https://news.ycombinator.com/item?id=21195107

Interesting paper: "Eliminating Background-bias for Robust Person Re-identification" - https://openaccess.thecvf.com/content_cvpr_2018/papers/Tian_...


> The technology that’s replacing the green screen [...]

This only replaces matting (which includes blue screen) in one very specific type of photography for now: shots where the background is out of focus.


These are great — particularly that first link. What a great explainer.


I can't be the only one who read this as mating


You aren't. In fact, as a non-native speaker it's hard not to misread it; I didn't even know what matting was. And "real-time" and "human" go perfectly well with mating.


How is babby formed? Is a Green Screen Necessary for Real-Time Human Matting?


And here I read it as meeting. I'm glad I'm on vacation, but next week is the week of meetings...


This tech will surely be used in a lot of video meetings


It already is, or at least similar tech. I've used it in Zoom and Microsoft Teams to hide my background.


What tech? There is no source on their github and natural image matting has been an area of constant research for 20 years. You can search for bayesian matting, poisson matting and closed form matting for early influential papers.
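For anyone unfamiliar with the term: natural image matting estimates a per-pixel alpha matte α so that each observed pixel decomposes as I = αF + (1-α)B (foreground composited over background); the hard research problem is recovering α, F, and B from I alone. The compositing step itself is trivial - a minimal NumPy sketch (illustrative only, with a toy matte; not from any of the papers above):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Composite foreground over background using an alpha matte.

    fg, bg: float arrays of shape (H, W, 3), values in [0, 1]
    alpha:  float array of shape (H, W), values in [0, 1]
    """
    a = alpha[..., None]            # broadcast matte over the color channels
    return a * fg + (1.0 - a) * bg

# Toy example: left column fully foreground, right column fully background
fg = np.ones((2, 2, 3))             # white foreground
bg = np.zeros((2, 2, 3))            # black background
alpha = np.array([[1.0, 0.0],
                  [1.0, 0.0]])
out = composite(fg, bg, alpha)
# out[:, 0] is white (pure foreground), out[:, 1] is black (pure background)
```

Matting methods differ in how they estimate the fractional α values at hair, motion blur, and other soft boundaries - which is exactly what distinguishes them from binary person segmentation.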


Same here; I call them Freudian misreads.


It would be Freudian if there were some subconscious tendency at play to misread it. But here, even though "mating" is sexual, the misread is easy and effortless because "matting" is an obscure industry term that looks almost identical to "mating"...


Could be tiredness as well. When I saw the headline on the front page earlier, I read the word correctly. When I alt+tabbed to it just now, being tired, I did a double take because I read it as "mating".


In French we have the phrase "lapsus révélateur" for such slips.


Lol, no. I read it that way too.


I mean, cool, but I was expecting code when I clicked on a GitHub link. Maybe would be better to just link to the preprint instead


Same. A github link kinda assumes a repo of code to poke at.


The real problem is not having a first-party way to use our damn phones as webcams across PCs/Macs natively, without resorting to flaky IP cam software or virtual loopback devices that are laggy when they work at all.


[Disclaimer: Author of Telecast here]

I've been working on a macOS and iOS app to do just this, if you have an older iPhone lying around: https://telecast.camera/

I've been using it to record multicam music performances/practice sessions as I didn't have an expensive DSLR. The latency of other solutions was too high to easily sync guitar strumming and audio-in from an audio interface.

I'd love feedback! Things like the linked network are of interest, as I'd like to add support for background replacement in the future, too.


If you want to transmit the camera stream wirelessly in real time that effectively makes the phone an IP cam, with all the flakiness that often entails. It's not an easy problem.

I'm also not quite sure which problem is being solved by using your phone as a webcam. Actual webcams are quite cheap and come with way better mounting hardware than your phone. And for freehand videos like in the paper the realtime requirement isn't very common, and if it's there it's usually sufficient to do the processing on the phone and stream directly from there. There has to be something I'm not seeing?


Part of it is that webcams have been nearly impossible to buy since the pandemic began, but apart from that, webcam image quality is terrible - even for expensive webcams in the €100+ range.

A recent iPhone will wipe the floor with any webcam I've come across, short of an SLR or mirrorless camera.


It may be a little-used feature nowadays, but you can still transfer data using your phone's charging port.

And what about Bluetooth, are BLE cameras also flaky? (I'm sure classic Bluetooth cameras are.)


For what it's worth, the NDIHX software works well on iPhone and Mac through OBS, and is at least available on Windows.

I'd like Macs to have an official loopback device so video effects wouldn't have to be implemented in every program separately. For example an overlay that worked in Facetime/Zoom/Teams/etc.


Isn't that sort of how SnapCamera and the like work today? You give the physical camera to Snap, which does the video processing and then creates a virtual camera device that you can select as your camera in Zoom. It's not exactly as you describe (and FaceTime doesn't respect/offer it), but Zoom doesn't have to implement the effects and can still use the device.


Yes, but it's fragile. Zoom has disabled whatever permission is needed for a virtual camera in the past, and I think Skype requires modifying the binary to enable it. The OBS virtual camera seems to work well now, but that's a recent change and I'm not sure I trust it to keep working on new versions of macOS.


I agree! Recently I've been wondering whether there is any reason Android devices with a camera don't implement the USB Video Class (UVC) specification, other than that nobody cared enough.


Even low-end Android phones take selfie video of far better quality than any built-in laptop webcam I've used. And unless you spend €200 on a top-end webcam, you won't get near that quality - while for €200 you can get a pretty decent phone like the OnePlus Nord, which also has a decent camera.


> We then plan to release the code of supervised training and unsupervised SOC in Jan. 2021.

God, I’m so jaded on these promises these days. I hate looking back at old papers that promised to make their stuff available, but never did.


Also there is no code, so posting the GitHub link right now is moot.


I'm sure it's great, but we'll have to wait for the source/demo/pretrained model to really tell.


Proposed timeline from GitHub:

> We first plan to publish an online image/video matting demo along with the pre-trained model in Dec. 2020. We then plan to release the code of supervised training and unsupervised SOC in Jan. 2021. We finally plan to open source the PHM-100 validation benchmark in Feb. 2021.


This sounds awesome, so don't take my comment as criticism, but it's interesting that it highlights a difference between academics and non-academics: academics often publish papers before code, whereas non-academics/hackers publish code and then probably never get around to writing the documentation :)


Maybe if we can get the academics and hackers to perform some human matting..

[reference to comments about misreading the title]


I think the linked Youtube demo video is pretty impressive, compared to what I see in most video conferencing software.


Generally speaking, reproducibility in research is not the main concern.


There is no code in this repo. Would much rather this be posted when there is something to actually test.


I wonder how this compares to remove.bg [1] (for images) and unscreen.com [2] (for video)?

[1] https://www.remove.bg/

[2] https://www.unscreen.com/


This reminded me a lot of this study: https://github.com/senguptaumd/Background-Matting


Maybe someone here will know this: are there similar things for artificial backgrounds that fit with my face and body position?


Since there is no code here, here is another repo: https://github.com/fangfufu/Linux-Fake-Background-Webcam

I haven't tried it yet, but it supports Nvidia GPUs as well as CPU-only operation.
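Once you have a person mask from any segmentation or matting model (MODNet, BodyPix, etc.), the background replacement itself is a one-line composite per frame. A hypothetical sketch in NumPy - the frames and mask here are made up for illustration, not taken from that repo:

```python
import numpy as np

def replace_background(frame, new_bg, mask):
    """Keep pixels where mask ~ 1 (the person); substitute new_bg elsewhere.

    frame, new_bg: uint8 arrays of shape (H, W, 3)
    mask: float array of shape (H, W) in [0, 1], e.g. output of a matting model
    """
    m = mask[..., None]                                # broadcast over channels
    out = m * frame.astype(np.float32) + (1.0 - m) * new_bg.astype(np.float32)
    return out.astype(np.uint8)

frame = np.full((4, 4, 3), 200, dtype=np.uint8)        # camera frame ("person")
new_bg = np.zeros((4, 4, 3), dtype=np.uint8)           # black replacement background
mask = np.zeros((4, 4))
mask[:, :2] = 1.0                                      # pretend the left half is the person
out = replace_background(frame, new_bg, mask)
# left half keeps the camera frame, right half becomes the new background
```

The expensive part is producing a good soft mask in real time; the GPU/CPU option in such tools refers to where that model inference runs, not to this composite.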


Matting not mating ... Matting ...



