An Algorithm for Nudity Detection [pdf] (vidcat.org)
15 points by symisc_devel on Feb 21, 2014 | 22 comments


When I was in the advertising business, one of the core products I created was a brand-safety product - basically preventing advertisers from advertising on dodgy sites.

I messed around with algorithms that detected nudity (because big brand advertisers don't want their ads showing up on porn sites). One of the more interesting and simple-to-use ones was actually a simple averaging of the images across multiple samples. It was easy to implement and gave relatively good results.

In the end, though, I didn't use it, because text-clustering algorithms worked better for classifying content.


"The training set for the skin filter consisted of 1,182,608 manually labeled skin pixels and 10,471,553 manually labeled non-skin pixels while the testing set consisted of 2,303,824 manually labeled skin pixels and 24,285,952 manually labeled non-skin pixels."

That's a lot of pixels to manually label.


40 million pixels is roughly forty ~1-megapixel (1280x720) images. That's not many, especially if the labeling looks something like this: http://pascallin.ecs.soton.ac.uk/challenges/VOC/images/voc20...
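
For anyone who wants to check that estimate, here is a quick sketch of the arithmetic. It uses only the four pixel counts quoted from the paper above and the 1280x720 frame size from this comment; the variable names are just for illustration.

```python
# Pixel counts quoted from the paper's description of the skin filter.
train_skin, train_non_skin = 1_182_608, 10_471_553
test_skin, test_non_skin = 2_303_824, 24_285_952

# Total number of manually labeled pixels across training and testing sets.
total_pixels = train_skin + train_non_skin + test_skin + test_non_skin

# One 1280x720 image is a little under one megapixel.
pixels_per_image = 1280 * 720

# How many such images the labeled pixels would fill.
equivalent_images = total_pixels / pixels_per_image

print(total_pixels)                 # 38243937
print(round(equivalent_images, 1))  # 41.5
```

So the labeled data is about 38 million pixels, or roughly 41 images at that resolution.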


You're right. Didn't do the math.


One word: rectangles.


This could only be a very rough first pass at detection. Bathing suits can be very skimpy without the wearer being fully nude.

And social context plays a large role: for instance, distinguishing between a fat male's nipples and a small-chested female's nipples would be impossible without analyzing a lot more than skin color.

http://i.imgur.com/sb6Iw.jpg


This doesn't seem like a very scalable approach to the problem. I would think that if you wanted to capture all nudity (including monochromatic or illustrated), you would instead go at the problem from the angle of titillation. You could even round up images that don't involve humans at all (fruit arranged provocatively, for instance).


If the algorithm gets aroused, return positive, otherwise return negative. Bam, patent.


So your algorithm would be an MTurk API?



Uhmm... I think I've seen this somewhere else...


I wrote a really quick Python implementation of this a few months ago, in case someone is interested: github.com/ParthGandhi/nude.py


I'm under the impression that Google's machine learning is already filtering your personal G+ photos for nudity.


Maybe the best nudity detector API involves the following:

1. Attempt to post the image to Google+.

2. If the post is there, there is no nudity; if it failed to post, there is nudity.


So, G+ as middleware?


Does anybody know if there is an API that does this?


How do these nudity detection APIs work? Is there crowdsourcing going on under the hood? Or are they using some clustering algorithm to detect a range of skin colors (e.g., if 90% of the pixels are skin, it's nude)?
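
To make the "range of skin color, then a percentage cutoff" idea concrete, here is a minimal sketch. It is not the linked paper's method or any real API: the RGB rule in `is_skin` is a commonly cited daylight skin heuristic that I'm assuming for illustration, the function names are hypothetical, and the 0.9 threshold is just the 90% figure from the question.

```python
def is_skin(r, g, b):
    """Rough per-pixel RGB skin test (assumed heuristic, not the paper's filter)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # enough color spread
        and abs(r - g) > 15                   # red clearly differs from green
        and r > g and r > b                   # red is the dominant channel
    )


def skin_ratio(pixels):
    """Fraction of (r, g, b) pixels that pass the skin test."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin(r, g, b) for r, g, b in pixels) / len(pixels)


def looks_nude(pixels, threshold=0.9):
    """Naive classifier: flag the image if enough of it is skin-colored."""
    return skin_ratio(pixels) >= threshold


# Toy "image": 19 skin-toned pixels and 1 dark background pixel.
toy_image = [(220, 170, 140)] * 19 + [(30, 30, 30)]
print(looks_nude(toy_image))              # True
print(looks_nude([(30, 30, 30)] * 10))    # False
```

A bare ratio like this would flag close-up portraits and miss plenty of real cases, which is presumably why the paper does more than count skin pixels.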


The link that you just commented on answers your question in literally the best way possible: it's an academic paper on an algorithm used for this.


To be fair, he may have tried that. The site is completely unusable on a phone, as it seems to make assumptions about the minimum screen resolution and doesn't allow you to zoom out the way most desktop sites do on a mobile screen. In fact, it's one of the most poorly designed sites I've stumbled upon in a long time in that regard (which is a great pity, as the content looked interesting).


If you click the text "An Algorithm for Nudity Detection [pdf]", that's actually a link.


Protip: If you film pornography in black and white, there is no such thing as "skin color."

If you remap the palette for a desaturated image, so that everyone's skin is green or magenta, are there any fewer penises penetrating vaginas in the image? If it's a horse and a fully clothed person's mouth, where is the algorithm for that?

Think about it.
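
A small sketch of why desaturation defeats a purely color-based skin test. The `is_skin` rule below is a commonly cited RGB heuristic that I'm assuming for illustration (it is not taken from the linked paper), and `to_gray` uses the standard BT.601 luma weights.

```python
def is_skin(r, g, b):
    """Rough per-pixel RGB skin test (assumed heuristic, not the paper's filter)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def to_gray(r, g, b):
    """Convert a pixel to grayscale using BT.601 luma weights."""
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)


pixel = (220, 170, 140)      # a skin-toned pixel
gray = to_gray(*pixel)       # same pixel after desaturation

print(is_skin(*pixel))       # True
print(is_skin(*gray))        # False
```

Every grayscale pixel has equal R, G and B, so any rule that depends on the channels differing rejects it regardless of brightness.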


How many pornographic images do you see on the internet that are taken in black and white? How many are modified so that the people involved look like the Hulk? An algorithm like this does not need to be 100% bulletproof in every situation imaginable; it just needs to work most of the time.

Think about it.



