Yes, even within a particular video there are lots of frames where the act is implied, not directly shown, like a close-up of other people's faces. Karpathy et al. showed they could still learn from the sports video database even without removing random crowd shots or announcer shots.
I think the quality of the data influences the result, and hand-crafting the dataset is what led to 95% accuracy on new instances.
Not really. 1 hour of video in 36 seconds is 100 hours of video per hour of computation. Assuming you go with a cluster of higher-end graphics cards, you could pretty easily perform 100x better. That's 10,000 hours of video processed per hour of computation. I don't know the size of the pornhub back catalog, and I'm scared to search since I'm at work right now, but even if it's hundreds of millions of hours you could go through the whole thing in a couple of years on a single cluster, and much faster if you throw more hardware at it.
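A quick back-of-envelope in Python; the 36-seconds-per-hour figure is from the project, while the 100x cluster speedup and the 200-million-hour catalog size are assumptions for illustration:

```python
# Back-of-envelope throughput estimate (assumed numbers, not benchmarks).
SECONDS_PER_HOUR = 3600

single_gpu_rate = SECONDS_PER_HOUR / 36       # hours of video per compute-hour
cluster_rate = single_gpu_rate * 100          # hypothetical 100x GPU cluster
catalog_hours = 200_000_000                   # guess at "hundreds of millions"

compute_hours = catalog_hours / cluster_rate
compute_years = compute_hours / (24 * 365)

print(single_gpu_rate)   # 100.0
print(cluster_rate)      # 10000.0
print(round(compute_years, 1))
```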
I think that's doable. I'll be adding an autotag mode. I've been thinking about other attributes I can detect, from race to hair color to number of participants.
Crowdsourcing is good, except it can't tag new videos that no one has seen yet.
This program can also be viewed as a general framework for classifying video with a Caffe model, using batching and threading in C++. By replacing the weights, the model definition, and the mean file, it can immediately be used to edit videos with other classes without recompiling.
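The batching-and-threading pattern described above can be sketched roughly like this (a Python stand-in for the C++/Caffe pipeline; `classify_batch` is a hypothetical stub where the real tool would run a Caffe forward pass):

```python
# Sketch: one thread decodes frames, another classifies them in batches.
import queue
import threading

BATCH_SIZE = 4

def classify_batch(frames):
    # Stand-in for a Caffe forward pass over a batch of frames.
    return [f"class_for_{f}" for f in frames]

def decode_frames(source, frame_queue):
    # Producer: decode frames and hand them to the classifier thread.
    for frame in source:
        frame_queue.put(frame)
    frame_queue.put(None)  # sentinel: no more frames

def classify_loop(frame_queue, results):
    # Consumer: accumulate frames, run the model once per full batch.
    batch = []
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        batch.append(frame)
        if len(batch) == BATCH_SIZE:
            results.extend(classify_batch(batch))
            batch = []
    if batch:  # flush the final partial batch
        results.extend(classify_batch(batch))

frames = [f"frame{i}" for i in range(10)]
q = queue.Queue(maxsize=8)
results = []
producer = threading.Thread(target=decode_frames, args=(frames, q))
consumer = threading.Thread(target=classify_loop, args=(q, results))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(len(results))  # 10
```

Decoupling decode from inference this way keeps the GPU fed while the CPU decodes the next frames, which is where most of the throughput comes from.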
Seriously, though, synthesis using a recognition model can be a good reality check to remind us of the shortcomings of the model's "understanding" of the domain.
https://github.com/ryanjay0/miles-deep/raw/master/images/pre...