rightsForRobots's comments

This is the best I got. You'll have to imagine what the video was:

https://github.com/ryanjay0/miles-deep/raw/master/images/pre...


Yes, even within a particular video there are lots of frames where the act is implied, not directly shown, like a close-up of others' faces. Karpathy et al. showed they could still learn from the sports video database even without removing random crowd shots or announcer shots.

I think the quality of the data influences the result, and hand-crafting the dataset is what led to 95% accuracy on new instances.


Good question. There's an example specifically using CaffeNet to find cats. Any Caffe model can be dropped in without recompiling.


This program can classify 1 hour of video in 36 seconds on my low-end GTX 960 4GB.


And for PornHub's back catalog, this would still take longer than the heat death of the universe.


Not really. 1 hour of video in 36 seconds is 1,000 hours of video / hour of computation. Assuming you go with a cluster of higher-end graphics cards, you could pretty easily perform 100x better. That's 100,000 hours of video processed / hour of computation. I don't know the size of the PornHub back catalog, and I'm scared to search since I'm at work right now, but even if it's hundreds of millions of hours you could go through the whole thing in like 2 months tops.


Isn't 1 hour of video in 36 seconds a 100x speedup instead of 1,000x? Agreed that it's definitely doable if they want.
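The arithmetic in this exchange is easy to check with a quick sketch. The 36 seconds per hour of video is the figure quoted in the thread; the 100x cluster factor and the catalog size are the hypothetical numbers floated above, not measurements.

```python
SECONDS_PER_HOUR = 3600


def speedup(video_seconds: float, compute_seconds: float) -> float:
    """Hours of video processed per hour of computation."""
    return video_seconds / compute_seconds


def days_to_process(catalog_hours: float, video_hours_per_compute_hour: float) -> float:
    """Wall-clock days needed to classify a catalog at a given throughput."""
    return catalog_hours / video_hours_per_compute_hour / 24


# 1 hour of video classified in 36 seconds (figure from the thread).
single_gpu = speedup(SECONDS_PER_HOUR, 36)

# Assumption: a cluster that is 100x faster than the single GTX 960.
cluster = single_gpu * 100

# Assumption: a hypothetical catalog of 100 million hours.
catalog_hours = 100_000_000
days = days_to_process(catalog_hours, cluster)

print(single_gpu)      # 100.0
print(round(days, 1))  # 416.7
```

So the single-GPU rate is 100 hours of video per hour of computation. Under these assumed numbers the hypothetical cluster would need about 417 days; the 2-month estimate only holds if the starting rate really were 1,000x (about 42 days).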


I think that's doable. I'll be adding an autotag mode. I've been thinking about other attributes I can detect, from race to hair color to number of participants.

Crowdsourcing is good, except it can't tag new videos no one has seen yet.


This program can also be viewed as a general framework for classifying video with a Caffe model, using batching and threading in C++. By replacing the weights, model definition, and mean file, it can immediately be used to edit videos with other classes without recompiling.
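To illustrate the batching-and-threading idea, here is a minimal sketch in Python. It is not the project's C++ code, and it does not call Caffe. The names `classify_stream` and `fake_classify_batch` are hypothetical, and the stub classifier stands in for whatever model would actually be loaded.

```python
import queue
import threading
from typing import Callable, Iterable, List, Sequence

_SENTINEL = object()  # marks the end of the frame stream


def classify_stream(
    frames: Iterable,
    classify_batch: Callable[[Sequence], List],
    batch_size: int = 32,
    max_pending: int = 4,
) -> List:
    """Group frames into batches on a producer thread while the calling
    thread runs the (hypothetical) model on each finished batch."""
    batches: queue.Queue = queue.Queue(maxsize=max_pending)

    def producer() -> None:
        batch = []
        for frame in frames:
            batch.append(frame)
            if len(batch) == batch_size:
                batches.put(batch)  # blocks if the consumer falls behind
                batch = []
        if batch:
            batches.put(batch)  # final partial batch
        batches.put(_SENTINEL)

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    results: List = []
    while True:
        batch = batches.get()
        if batch is _SENTINEL:
            break
        results.extend(classify_batch(batch))
    worker.join()
    return results


def fake_classify_batch(batch: Sequence[int]) -> List[str]:
    """Stand-in for a real model: labels integer 'frames' by parity."""
    return ["even" if frame % 2 == 0 else "odd" for frame in batch]


labels = classify_stream(range(10), fake_classify_batch, batch_size=4)
print(labels[:3])  # ['even', 'odd', 'even']
```

Swapping in a different `classify_batch` changes the classes without touching the pipeline, which is the same spirit as replacing the weights, model definition, and mean file.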


I don't have the rights to the dataset, so unfortunately I won't be releasing it.


Good point. The project mentions this work which did experiments on sports: http://cs.stanford.edu/people/karpathy/deepvideo/


The photos aren't available but the trained model is. It's a set of weights.


Run it backwards in inception mode? I wonder if it could generate porn, or what Gigeresque horrors would come out if somebody tried.


Image Synthesis from Yahoo's open_nsfw (https://news.ycombinator.com/item?id=12756462)


I actually tried that. It's not as interesting as you'd think. Perhaps having an 'other' category makes it more difficult.

It's trivial to drop the model into the deepdream ipython notebook they provide: https://github.com/google/deepdream/blob/master/dream.ipynb


Tried again. Better: http://i.imgur.com/ALoLmcX.jpg


Seriously, though, synthesis using a recognition model can be a good reality check to remind us of the shortcomings of the model's "understanding" of the domain.

