I think that if a project FB uses upstream is Apache 2 licensed, or if you are a large organization (for example the Apache Software Foundation), you can use your clout to force Facebook to use Apache.
So is this the end of Google captchas asking for where the car/sign/whatever is? Will there be a final battle of AIs, where they will kill each other, and the unfettered access to websites over VPN/tor wins and laughs the last laugh?
For the car captchas, I've found actually clicking all the boxes with part of a car will always be a wrong answer (distinct from when it just makes you answer twice). Instead, you have to click on the squares that you know it thinks are cars.
This creates a twisted Turing test situation where, to prove you are a human, you have to pretend to be a machine's idea of what a human is.
It was the same when this was about words from old books. I always had to fill in the letters the average person would have thought they were, not what they actually were (e.g. the letter "f" for what was really an "s" in gothic type).
Nowadays it's much easier, you can click anything that looks vaguely the same (e.g. boxy things for cars, ads for traffic signs, traffic signs for store fronts etc.). The fact that it's so easy to poison the training set makes me very wary about the autonomous car future...
I actually like poisoning them. Not to be malicious but I feel manipulated into training their software for free. "Oh you wanted to sign up for that web forum? Sorry, but you have to do some free work for us first"
And if you think that it's somehow good because it's mutually beneficial to train AI to better the future of humanity, don't. That is what their marketing department wants you to think.
To use an in-thread example - online forums are created with free work. Should all forums be forced to make their archives available for free download as well?
I have more problems with bridges than with cars. The damn thing forces you to select three bridges when there are only two... So you are forced to select what it thinks is a bridge, and confirm its erroneous bias even more.
Storefronts are difficult too, I don't think I'm ever good enough at those to satisfy it. The most reasonable one seems to be street signs, but I think it fails me for not flagging the unpainted back of one.
I had to do 20 or more of these to get some post tracking data recently. Found the same thing with most objects where a very small part of an object hadn’t been classified as containing that object.
Does it tell you that you're wrong, or simply give you another set? If it just gives you another set, it may be that it thinks you can provide more useful data.
If you fail, it tells you you were wrong (some red text at the bottom) and gives you another challenge. I think it will sometimes just give you another challenge for more data, but it won't have red failure text.
> This creates a twisted Turing test situation where, to prove you are a human, you have to pretend to be a machine's idea of what a human is.
Exactly. I think reCAPTCHA was better when it was looking for consistency with other human answers. Using "AI" has the same problem you mentioned, plus it's more vulnerable, because it assumes your "AI" is unapproachably far ahead of competitors'.
I went through exactly this today. Wasn't sure if it was the computer being dumb, or other people missing corners of signs or bridges etc. Still, takes me 5 goes every time.
The problem with the cars for me is their definition -- is that van classed as a car? How about the hatchback? What about a 4x4? What about a 4x4 with no back windows? ...
What about a square that contains just a sliver of the car from the next box over? Does that still count? I hypothesize that my attempt to classify every pixel related to a car as a "car" may contribute to my failing of these tests.
This is what kurtisc is saying. The algorithm is unable to classify those pixels, so you have to guess which bits of the car can be detected by the algorithm and then select only those. So you have machines asking humans to think like a machine in order to prove they are human.
Isn't it supposed to learn from you? I.e., answering technically correctly but in a way it thinks is incorrect is slightly annoying for you, but better for the system in the long run (i.e. better for everyone).
Everyone? Or Google? I don't feel particularly happy about being used as a lab rat, so no, thank you - I don't care about the quality of Google's AI. If anything, I would purposefully mislead it if I knew how.
So we will get even more "targeted ads" in our faces? No, thanks. I think ML/AI has a great potential (especially in medicine), but I just don't trust ad companies to use it for any good cause.
Has Google published this training set somewhere? Until they do, you're absolutely right that this is a great way to build a training set, but I don't see how it's to anyone's benefit but Google's.
I find the same thing with the road signs: other people are giving (IMHO) the wrong answer by not including squares that cover a small but non-zero part of the sign.
Of course if you start googling for street signs a lot, they'll hit you with a captcha for each search, so that you can't ask Google to solve their captchas for you until you solve their captchas.
Your https-everywhere is doing it wrong. I don't have an SSL cert on my personal site (no requirement to). I do have SSL on other domains that share that IP, however.
The idea that you have no requirement to is wrong: your site may not have sensitive information, but without SSL it can be MITM'd and used as a vector for malware, ad injection, etc.
Your argument is valid (somewhat), but I don't think I attract enough traffic on a constant basis to justify the time and effort to SSL up all my domains.
So far I've had a less than stellar experience with letsencrypt, so I'm not quite ready to go all free-certs just quite yet. It also requires a rebuild of my web-server[1] which I've been putting off for a very long time already.
So long as your site doesn't have SSL, you shouldn't be linking people to it. You're under no obligation to drop everything and fix your web server security, but you should really stop using it until you do.
I'm no expert on this, but what I'm seeing is that port 443 is serving up a response when jaruzel.com is requested on that port. While you may not be actively advertising that domain with https URLs, it is valid for clients to request one speculatively.
There isn't a valid cert for that domain, and for some reason the server is offering a different one. Presumably you need to unbind 443 from that host-header name (this is based on memories of configuring IIS a decade ago).
"port 443 is serving up a response when jaruzel.com is requested on that port"
The only response is a 404, which is exactly what should be displayed (to the best of my knowledge) for a domain that isn't configured for that IP/port when there are other sites utilizing that IP/port.
I have an IP... that IP points to a router, that router port-forwards ports 80 and 443 blindly to a web server, on that web server is a bunch of websites. IIS knows which ones to serve to clients based on a) the host-header, and b) the port.
jaruzel.com:443 is not valid, but because I run an older version of IIS[1], that does not support SNI, the cert is bound to the port, not the host-header. As such any domain name that points to the IP will dump you at that cert if you try to connect on port 443.
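For anyone following along: the SNI-aware behaviour that newer servers have (and this older IIS lacks) is that the server reads the hostname the client sends in the TLS ClientHello and picks a matching certificate, instead of binding one cert to the whole port. A minimal sketch of the mechanism in Python (names and the callback body are illustrative, not a production server):

```python
import ssl

# With SNI, the server inspects the hostname from the ClientHello and
# can swap in a per-hostname certificate context. Without it (as with
# pre-SNI IIS), one cert is bound to the port for every hostname.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

seen = []  # records which hostnames clients asked for

def pick_cert(ssl_socket, server_name, context):
    # server_name is the SNI hostname (or None if the client sent none).
    # A real server would assign ssl_socket.context to a context with
    # the right cert chain loaded here; this sketch just records it.
    seen.append(server_name)

ctx.sni_callback = pick_cert  # requires Python 3.7+
```

The point is that certificate selection happens per handshake, not per port, which is exactly what the older IIS setup can't do.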
Complaining about a non-existent SSL cert and then backseat driving the "fix", using words like "you need to", all based on shady memories of configuring IIS a decade ago?
This is some of the most advanced work out there, but CV is not "solved": most vision systems can only label about 1k categories of objects, so captchas can still easily be constructed that would fool these systems. That's part of why it's exciting to get this out there, so others can help us improve it.
Imagenet is the only reason why most models are trained on 1K of categories. There's plenty of models in the wild that handle 10s of thousands of classes.
That's impressive work. I still don't think we've reached human level for all the categories of things we see in images, but you are correct that my comment about 1k categories is not true for many production systems.
Whilst there are plenty of things in CV that computers aren't super-human at yet, object classification (given 100+ examples) is not one of them. In datasets with tens of thousands of categories, humans are much worse than computers - e.g. humans are really not good at knowing the difference between every type of mushroom, algae, and model of airplane.
Further, nearly every time a computer recently has been trained to do some very nuanced classification, such as in radiology, they exceed human expert performance.
(Outside of classification, computers are rapidly making progress - for instance they are getting surprisingly good at predicting the next few frames of a video, which requires a lot of "world knowledge" to do correctly.)
Definitely not close to having things work for all categories. As you scale up to more categories ambiguity and specificity becomes an issue. Clarifai has a nice demo of their model which has >10K classes, https://clarifai.com/demo , the top predictions are usually correct but not always the most relevant.
I only linked to the xception paper because it mentions JFT. It's not state of the art for large scale recognition.
It's not just a matter of detecting objects and locating them. The deeper computer vision problem is to identify object attributes, relations between objects and actions in video. It's much harder to do that because many relations appear in very diverse situations, with objects of different categories, so it's hard to have 1000's of examples for each class of relation.
For example, humans can identify a monkey riding a Segway on the airport runway, but there probably is no such thing in the training set, even if it is quite large. The neural net might not know if that constitutes a "riding" action because it has never seen such a combination. Maybe the monkey is jumping over the thing and the picture shows it in proximity to it, not riding it - a human would know that a slight gap means there is no riding taking place.
Then, the even harder problem is to predict the consequences of actions on objects and just to physically simulate the scene. Such knowledge is useful in robot action planning. Beyond computer vision, there is also a need to create a "mental simulator" that has theory of mind and can simulate other agents (what humans intend), and we need simulators, both physical and mental to create the next level of AI.
The link only works if you are logged in; otherwise it says the page was not found, which is the wrong message because it makes you think the page doesn't exist even if you then log in.
Google doesn't even show a captcha if they have enough tracking info to verify you're a human, which is pretty simple for them if you don't clear all your cookies for a few days. I'm pretty sure Google considers it a failure if they have to show you a captcha at all, but they also don't feel the need to make the captchas particularly easy when they do.
Except when it doesn't. Every so often it suddenly spams you with a sheer number of captchas on sites that use Google's reCAPTCHA API. Then you have to solve the detect-the-houses, cars, store-fronts, and street-signs challenges several times in a row, or be prepared to solve several more captchas before you're allowed to move on. I wish Google would fix reCAPTCHA. It used to be so good.
Are you using an adblocker or any other privacy extensions (PrivacyBadger, Disconnect, etc.) ?
If you are, Google will spam you to death with captchas; it kinda makes sense because captchas are getting easier to solve for machines, so apparently the new test of humanity is whether Google can track your activity on other sites.
Obviously I am 99.99% sure I am clicking correctly, yet reCAPTCHA shows me captcha after captcha until it lets me move on. Is it a system bug, or has their bot detection become quite unreliable?
You can complain to the site operator. Google is not forcing their captcha on websites. If enough people do that, perhaps site operators will take notice. Tell them exactly what you don't like, so they don't eventually switch to a different vendor with the same annoying captcha tech.
I remember one particular set of pictures with both buses and coaches (and some with neither) on it, and it asked me to click on the buses. It said I'd got it wrong, but presented an almost identical set afterwards, and I included the coaches, but that was also wrong.
I've seen this happen, and I think it was because the IP address was previously used for scraping of some sort, or somehow set off some flags at Google. VPN providers can cause this because their IPs are sometimes used for just such things.
I must be really unlucky in that I regularly get pretty challenging captchas which require a lot of tries before it lets me through. It's frustrating enough to make me avoid future visits to sites which use their captchas.
I take it you don't use a VPN. I often have to pass a captcha just to use Google search when on an IPv4 VPN address. (Considering that ISPs are now allowed to sell my data, using a VPN seems like an obvious choice.)
It's annoying to the point that it's pushed me to use DuckDuckGo more often and I tend to avoid platforms that require me to continually take their captchas. I used Discord for a little while, but once it started asking me to verify my humanity again periodically per session, I booked it.
So, your traffic is coming from some provider that purposefully obfuscates info, and your traffic is mixed with a bunch of other people's? I can't imagine why they'd view you as less likely to be verified as a real person...
I'm not saying that it's an unreasonable assessment on their part, but it is a large annoyance.
My question is: are they doing this to simply get more training data for image classification, reduce server load by minimizing automated traffic, or to sanitize their queries for human input for NLP models?
> My question is: are they doing this to simply get more training data for image classification, reduce server load by minimizing automated traffic, or to sanitize their queries for human input for NLP models?
As someone who works in an industry where CAPTCHAs have historically played a large role, and where some players flat out use technology to bypass them (via proxy and/or VPN services to get good IP addresses), I imagine those automated systems corrupt the CAPTCHA system somewhat, since it looks like a large corpus of humans behaving in a certain manner when it's not humans at all. It likely also causes those IPs to be considered highly suspect by the CAPTCHA system whenever they're encountered.
For your next questions, the industry is event ticket resale, and no, we don't do that (there are aboveboard ways to function in this market that rely less on brute force and more on data mining and analysis for specific targeted investment, and sometimes long after it's been on sale).
The future is going to be your corporation/country/blockchain's AI vs your adversary's corporation/country/blockchain's AI with vast numbers of humans in the middle of the whole sh*tstorm just trying to survive and live a tolerable life.
That's good, though, I wouldn't want a future where a single AI has all the power. We need a multitude of AIs to increase equality for humans and diversity for AI.
My overall feeling, as someone who wants to start getting into visual recognition, is that there are a bunch of great libraries/ecosystems to choose from, all with pros and cons, but I honestly don't want to make the wrong decision and end up stuck later on. Does anyone have any advice on what I should use to have a camera (RPi) recognize most common objects, and then add a layer where we can teach it specific objects, i.e. putting a name on a person or a pet? Thank you!
You should be set with OpenCV or JavaCV. JavaCV has more ported image/video processing libraries than OpenCV and is a bit more customizable in my experience; JavaCV literally includes OpenCV natively, just with a different wrapper. I would only suggest OpenCV if you are working with Python. It provides enough tools for basic to moderately complex objects, and if you want to mature in it you can then open up the hood. As for porting that pet project to a mobile Android device, I would suggest going straight to JavaCV, even though OpenCV has wrappers for Android and Java. The reason is that at some point you'll want to ditch their rigid methods of obtaining video/pictures, and JavaCV has served me well when integrating with Android. Also, if you eventually want to scale the processing up into a web service, it's easier.
Not sure if you're looking for CV or DL libraries. If you're looking for DL libraries, you can't go wrong with Tensorflow or PyTorch for research/development, and Tensorflow or Caffe2 for deployment. Tensorflow's a bit difficult to learn, but has a lot of great tooling around it, and PyTorch is the opposite. The other frameworks are fine, but don't have the same amount of documentation & beginner resources.
The first lesson is image classification ("is this a picture of a cat or a dog?"). Given that OP is commenting on an object detection library release, though, I assume they're interested in object recognition/detection/segmentation rather than just image classification. So, more like: "what things are in this image and where are they?" or even just "where are the dogs in this image?"
That's also covered eventually in fast.ai, but not until the second course if memory serves.
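To make the distinction concrete, here is a sketch of the two output shapes (all labels, scores, and boxes are made-up illustration values, not any particular library's API):

```python
# Image classification: one label (or label distribution) per image.
classification = {"label": "dog", "score": 0.93}

# Object detection: a list of (label, score, bounding box) entries,
# one per object found; boxes here are (x1, y1, x2, y2) pixel corners.
detection = [
    {"label": "dog",    "score": 0.91, "box": (12, 40, 200, 310)},
    {"label": "dog",    "score": 0.87, "box": (250, 60, 420, 330)},
    {"label": "person", "score": 0.78, "box": (400, 10, 560, 330)},
]

# "Where are the dogs in this image?" is then just a filter over detections:
dog_boxes = [d["box"] for d in detection if d["label"] == "dog"]
```

Segmentation goes one step further again, labelling per-pixel masks instead of rectangular boxes.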
This looks amazing from the computing point of view (and is an achievement of sorts), yet the confidence percentages are lower than what an average human could achieve (as in the case of CAPTCHA tests).
Meanwhile, I wonder about the human costs if systems like these are adopted for purposes where they may be ill suited for, especially cases where their confidence scores are ignored (or mistakenly assumed to be 100% even when they're lower). Anyone have reading material on this?
My primary worry is law enforcement and government surveillance considering these systems infallible and making judgments or life-changing decisions based on interpretations like this from computers. Computing has improved our lives a lot, but sometimes I feel there's an air of overconfidence that clouds our judgment.
Does anyone know an alternative that works on RaspberryPi? This states: "Detectron operators currently do not have CPU implementation; a GPU system is required."
+1 for the Google object detection API. The trained model is quite huge, though: 200 MB, based on a ResNet Faster R-CNN. There are creative ways of chunking this model to keep it small.
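One interpretation of "chunking" (purely illustrative; not how the API itself ships models) is splitting the weight file into fixed-size pieces that can be downloaded, cached, or lazily loaded independently and reassembled on device, so the app never has to hold or fetch the whole 200 MB at once:

```python
def chunk_bytes(blob, chunk_size):
    """Split a byte blob into fixed-size chunks (the last may be shorter)."""
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]

def reassemble(chunks):
    """Concatenate chunks back into the original blob."""
    return b"".join(chunks)

# e.g. a (fake) 2.5 KB "model" split into 1 KB pieces -> 3 chunks
model = bytes(range(256)) * 10
chunks = chunk_bytes(model, 1024)
```

Other readings of "keeping it small" (pruning, quantization, distillation) actually shrink the weights rather than just partitioning them; which one the parent meant isn't clear from the comment.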
Maybe https://pjreddie.com/darknet/install or one of the various forks. YOLO is supposed to be one of the best object detection systems as far as accuracy/speed is concerned.
Look at the first table in https://arxiv.org/pdf/1708.02002.pdf (the RetinaNet paper, code + models included in this release). There’s a substantial accuracy cost for YOLO vs other, similarly fast (within a factor of 2 or so), methods.
Unless I misunderstood figure 2, YOLO seems to be more than twice as fast as the runner-up, and yes, I agree and was already aware that YOLO's accuracy is not as good as other detectors'.
It's a bit disappointing, when caffe2 was released it was stated that mobile is a big focus, but things like this don't support mobile (even though some of this was demoed by FB on a phone).
FWIW we have some similar R-CNN models using this codebase running in the FB apps on iOS/Android with Caffe2 - with some neat bells and whistles (eg full execution of the compute graph on the device GPU where available via Metal/OpenGL). I’ll look into adding a mobile tutorial for Detectron if you’re interested.
That would be awesome. But - is a tutorial enough? From what I gather from the readme the custom ops have cuda implementations and no CPU ones, which you'd like to have as a fallback on mobile (even if metal/opengl implementations exist), or am I wrong?
I was running a NAS-based object detection model on my big new MacBook and it was taking about 30 s/image on an unoptimised TensorFlow build. I then tried a MobileNet model, which took about 3-4 s/image.
> Beyond research, a number of Facebook teams use this platform to train custom models for a variety of applications including augmented reality and community integrity.
I would expect it to work for detecting mismatched content for community types in general. For example, preventing pictures of cats or giraffes from being uploaded as a product photo on Poshmark when it's supposed to be a pair of shoes.
That type of check should become standard in a short amount of time for all communities that accept photos (excluding general-purpose ones, e.g. Imgur).
I can't wait for Facebook Marketplace to fail. The constant stream of useless ads is obnoxious. I am using Facebook less because there is no way to block that low-quality content.
When will counterfeits be detectable with somewhat decent accuracy? Is that something that has been attempted?
I've checked out Facebook Marketplace a few times since the launch, and every time I'm just overwhelmed by the sheer number of better or worse counterfeits.
On top of that, I started getting notifications about more counterfeits. For a short while I reported them, but after a while it felt pointless, and now I just ignore the Marketplace tab.
CUDA needs driver support to talk to the GPU, and since it is proprietary Nvidia technology, the open-source driver can't support it. So you either run the Nvidia driver or you have to use OpenCL.
I had to specifically disable nouveau to get cuda to install correctly on ubuntu 16.04. You need an nvidia card and drivers that are new enough to run the later versions of cuda.
Is there a class of math problem humans can solve but computers cannot? Then we could just use these problems as a guaranteed test instead of the current CAPTCHA arms race.
There is no arms race. 99.9% of the time Google knows if you're a robot based on your browser state. They make you label the images because it's a free way to get training data.
I see this argument a lot and considered it to be the case too, but I have to wonder if it really is. Wouldn't Google be empowering people to really mess up their training data?
If I'm trying to automate a system to fool their captcha, I'm probably giving them a lot of bad results. Or I could just be intentionally feeding them bad data; the fact that failing the captcha keeps letting me make more and more inputs would enable someone to do that for as long as they like.
I believe they cross-check between different users.
I believe it used to work something like this on the old reCAPTCHA system (the one with the words or house numbers): they show you two words, one known by reCAPTCHA and one unknown. If you enter the known one correctly, you can enter. The unknown one is presented to other users, and when there is enough consensus amongst users it is promoted to a known one. So it's hard to mess with the system as an individual.
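A toy sketch of that consensus-promotion logic (the thresholds are made up for illustration, not reCAPTCHA's actual values):

```python
from collections import Counter

def promote(answers, min_votes=5, min_agreement=0.8):
    """Return the consensus answer once enough users agree, else None."""
    if len(answers) < min_votes:
        return None  # not enough data yet; keep showing the challenge
    answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(answers) >= min_agreement:
        return answer  # promoted: can now serve as a "known" word
    return None  # too much disagreement; stays unknown

# One attacker's bad answer is drowned out by honest users:
votes = ["fought"] * 9 + ["xxxxx"]
```

Here `promote(votes)` yields `"fought"`, which is why individual poisoning attempts had little effect: an answer only became ground truth once many independent users converged on it.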
The thing you're missing is volume. Even if you assume the vast majority of people will attempt to mess up your data, when you have enough people contributing, you can look at the answers in aggregate and, based on patterns, disregard the bad data. It might be "expensive", but still worth it.
I've seen bot-detection arms races in video games first hand, and there is a lot more to gain on websites. Surely there are engineers smart enough to combat Google's bot detection, and with massive sets of labeled data Google stands a fighting chance.
Google has basically your entire browsing history, and search history. Google then compares that with what is considered normal for a human, and then lets you in if you pass that check.
Of course, if you run IRC bots that scrape Google with a headless browser to implement a .search feature and offer link titling in IRC, and you use a separate bucket of cookies and a separate IP for every bot, your bots now also have a human-looking search and browsing history, and will pass all reCAPTCHAs too...
I'm happy that tech companies are open-sourcing basic research all the time, and I've been thinking a lot about what would have happened if large pharmacy companies did the same thing. I'm hopeful that with new biotech companies, the science behind curing people will advance faster as well.
> thinking a lot about what would have happened if large pharmacy companies did the same thing
There is a company creating a 3d-printed chemical reactor. By downloading a schematic and buying some raw substances, you can create your own lab. It can be used to synthesise drugs in remote areas, such as on Mars, or to make generics for cheap. The exciting part is that the reactor schematic can be downloaded and shared easily. It can also make illegal drugs just as easily as 3d-printers can print guns.
Unlike the case in tech, pharma basic research is far less important in advancing our knowledge than academia's. A good example comes from the last few blockbuster cancer therapies: CAR-T cells and checkpoint blockade both arose in academic labs.
Also, for drugs that do make it to market, efficacy and side effect information is published as a condition of drug approval, at least for new drugs.
Whether basic science research papers should be behind a paywall is a wholly separate issue, but the life science community largely shares its finished products. Indeed, there’s even a push to share early stage data, too.
This is great! I do wish this were written in something other than Python, though. What is the carbon footprint of all this compute-intensive computer-vision code being run billions of times a day in Python? Someone should calculate...
If you choose to measure something in simply one dimension, don't be surprised (offended?) when someone else builds on your premise.
Humans are objectively terrible for the environment. Now you might start to argue other metrics instead, or that on an nth removed degree it/we might result in a net positive, but then you've abandoned your initial premise anyways.
It is funny to see this comment at "-4" already... What's so offensive? After all, Facebook has RocksDB in C++, Presto in Java, and a PHP-to-C++ compiler, so they clearly have both the belief and the skill to move away from interpreted programming languages for performance-sensitive code.
For some reason, people are offended by gross misunderstandings. This framework is "in Python", but the Python layer is a binding that sets up code that runs natively (I'm not sure of all the details myself; parts are written in CUDA). TensorFlow is the same way: the interface is Python, but the computations are not in Python. As you point out, that wouldn't make sense.
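NumPy is probably the simplest everyday illustration of this pattern: the Python lines only describe the computation, while the heavy lifting executes in compiled native code, just as TensorFlow and Caffe2 dispatch to C++/CUDA kernels:

```python
import numpy as np

# Python is only the orchestration layer here; the O(n^3) work of the
# matrix multiply runs inside NumPy's compiled BLAS routines, not in
# the interpreter.
a = np.ones((256, 256), dtype=np.float32)
b = np.ones((256, 256), dtype=np.float32)
c = a @ b  # each element is a dot product of 256 ones with 256 ones
```

So the per-image cost of a framework like this is dominated by the native kernels, and the Python overhead is a thin, roughly constant wrapper around each operation.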