I think that if a project FB uses upstream is Apache 2 licensed, or if you are a large organization (for example the Apache Software Foundation), you can use your clout to force Facebook to use Apache.
So is this the end of Google captchas asking for where the car/sign/whatever is? Will there be a final battle of AIs, where they will kill each other, and the unfettered access to websites over VPN/tor wins and laughs the last laugh?
For the car captchas, I've found actually clicking all the boxes with part of a car will always be a wrong answer (distinct from when it just makes you answer twice). Instead, you have to click on the squares that you know it thinks are cars.
This creates a twisted Turing test situation where, to prove you are a human, you have to pretend to be a machine's idea of what a human is.
It was the same when this was about words from old books. I always had to fill in the letters the average person would have thought they were, not what they actually were (e.g. the letter "f" for what was really an "s" in gothic type).
Nowadays it's much easier, you can click anything that looks vaguely the same (e.g. boxy things for cars, ads for traffic signs, traffic signs for store fronts etc.). The fact that it's so easy to poison the training set makes me very wary about the autonomous car future...
I actually like poisoning them. Not to be malicious but I feel manipulated into training their software for free. "Oh you wanted to sign up for that web forum? Sorry, but you have to do some free work for us first"
And if you think that it's somehow good because it's mutually beneficial to train AI to better the future of humanity, don't. That is what their marketing department wants you to think.
To use an in-thread example - online forums are created with free work. Should all forums be forced to make their archives available for free download as well?
I have more problems with bridges than with cars. The damn thing forces you to select three bridges when there are only two... So you are forced to select what it thinks is a bridge, and confirm its erroneous bias even more.
Storefronts are difficult too, I don't think I'm ever good enough at those to satisfy it. The most reasonable one seems to be street signs, but I think it fails me for not flagging the unpainted back of one.
I had to do 20 or more of these to get some post tracking data recently. Found the same thing with most objects where a very small part of an object hadn’t been classified as containing that object.
Does it tell you that you're wrong, or simply give you another set? If it just gives you another set, it may be that it thinks you can provide more useful data.
If you fail, it tells you you were wrong (some red text at the bottom) and gives you another challenge. I think it will sometimes just give you another challenge for more data, but it won't have red failure text.
> This creates a twisted Turing test situation where, to prove you are a human, you have to pretend to be a machine's idea of what a human is.
Exactly. I think reCAPTCHA was better when it was looking for consistency with other human answers. Using "AI" has the same problem you mentioned, plus it's more vulnerable, because it assumes your "AI" is unapproachably far ahead of competitors'.
I went through exactly this today. Wasn't sure if it was the computer being dumb, or other people missing corners of signs or bridges etc. Still, takes me 5 goes every time.
The problem with the cars for me is their definition -- is that van classed as a car? How about the hatchback? What about a 4x4? What about a 4x4 with no back windows? ...
What about a square that contains just a sliver of the car from the next box over? Does that still count? I hypothesize that my attempt to classify every pixel related to a car as a "car" may contribute to my failing of these tests.
This is what kurtisc is saying. The algorithm is unable to classify those pixels, so you have to guess which bits of the car can be detected by the algorithm and then select only those. So you have machines asking humans to think like a machine in order to prove they are human.
Isn't it supposed to learn from you? I.e., answering technically correctly but in a way it thinks is incorrect is slightly annoying for you, but better for the system in the long run (i.e. better for everyone).
Everyone? Or Google? I don't feel particularly happy about being used as a lab rat, so no, thank you - I don't care about the quality of Google's AI. If anything, I would purposefully mislead it if I knew how.
So we will get even more "targeted ads" in our faces? No, thanks. I think ML/AI has a great potential (especially in medicine), but I just don't trust ad companies to use it for any good cause.
Has Google published this training set somewhere? Until they do, you're absolutely right that this is a great way to build a training set, but I don't see how it's to anyone's benefit but Google's.
I find the same thing with the road signs: other people are giving (IMHO) the wrong answer by not including squares that cover a small but non-zero part of the sign.
Of course if you start googling for street signs a lot, they'll hit you with a captcha for each search, so that you can't ask Google to solve their captchas for you until you solve their captchas.
Your https-everywhere is doing it wrong. I don't have an SSL cert on my personal site (no requirement to). I do have SSL on other domains that share that IP, however.
The idea that you have no requirement to is wrong: your site may not have sensitive information, but without SSL it can be MITM'd and used as a vector for malware, ad injection, etc.
Your argument is valid (somewhat), but I don't think I attract enough traffic on a constant basis to justify the time and effort to SSL up all my domains.
So far I've had a less than stellar experience with letsencrypt, so I'm not quite ready to go all free-certs just quite yet. It also requires a rebuild of my web-server[1] which I've been putting off for a very long time already.
So long as your site doesn't have SSL, you shouldn't be linking people to it. You're under no obligation to drop everything and fix your web server security, but you should really stop using it until you do.
I'm no expert on this, but what I'm seeing is that port 443 is serving up a response when jaruzel.com is requested on that port. While you may not be actively advertising that domain with https URLs, it is valid for clients to request one speculatively.
There isn't a valid cert for that domain, and for some reason the server is offering a different one. Presumably you need to unbind 443 from that host-header name (this is based on memories of configuring IIS a decade ago).
"port 443 is serving up a response when jaruzel.com is requested on that port"
The only response is a 404, which is exactly what should be displayed (to the best of my knowledge) for a domain that isn't configured for that IP/port when there are other sites utilizing that IP/port.
I have an IP... that IP points to a router, that router port-forwards ports 80 and 443 blindly to a web server, on that web server is a bunch of websites. IIS knows which ones to serve to clients based on a) the host-header, and b) the port.
jaruzel.com:443 is not valid, but because I run an older version of IIS[1], that does not support SNI, the cert is bound to the port, not the host-header. As such any domain name that points to the IP will dump you at that cert if you try to connect on port 443.
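For anyone following along: the SNI-aware behaviour that newer servers have (and this older IIS lacks) is that the server reads the hostname the client sends in the TLS ClientHello and picks a matching certificate, instead of binding one cert to the whole port. A minimal sketch of the mechanism in Python (names and the callback body are illustrative, not a production server):

```python
import ssl

# With SNI, the server inspects the hostname from the ClientHello and
# can swap in a per-hostname certificate context. Without it (as with
# pre-SNI IIS), one cert is bound to the port for every hostname.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

seen = []  # records which hostnames clients asked for

def pick_cert(ssl_socket, server_name, context):
    # server_name is the SNI hostname (or None if the client sent none).
    # A real server would assign ssl_socket.context to a context with
    # the right cert chain loaded here; this sketch just records it.
    seen.append(server_name)

ctx.sni_callback = pick_cert  # requires Python 3.7+
```

The point is that certificate selection happens per handshake, not per port, which is exactly what the older IIS setup can't do.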
Complaining about a non-existent SSL cert and then backseat driving the "fix", using words like "you need to", all based on shady memories of configuring IIS a decade ago?
This is some of the most advanced work out there, but CV is not "solved": most vision systems can only label about 1k categories of objects, so captchas can still easily be constructed that would fool these systems. That's part of why it's exciting to get this out there, so others can help us improve it.
Imagenet is the only reason why most models are trained on 1K of categories. There's plenty of models in the wild that handle 10s of thousands of classes.
That's impressive work. I still don't think we've reached human level for all the categories of things we see in images, but you are correct that my comment about 1k categories is not true for many production systems.
Whilst there are plenty of things in CV that computers aren't super-human at yet, object classification (given 100+ examples) is not one of them. In datasets with tens of thousands of categories, humans are much worse than computers - e.g. humans are really not good at knowing the difference between every type of mushroom, algae, and model of airplane.
Further, nearly every time a computer recently has been trained to do some very nuanced classification, such as in radiology, they exceed human expert performance.
(Outside of classification, computers are rapidly making progress - for instance they are getting surprisingly good at predicting the next few frames of a video, which requires a lot of "world knowledge" to do correctly.)
Definitely not close to having things work for all categories. As you scale up to more categories ambiguity and specificity becomes an issue. Clarifai has a nice demo of their model which has >10K classes, https://clarifai.com/demo , the top predictions are usually correct but not always the most relevant.
I only linked to the xception paper because it mentions JFT. It's not state of the art for large scale recognition.
It's not just a matter of detecting objects and locating them. The deeper computer vision problem is to identify object attributes, relations between objects and actions in video. It's much harder to do that because many relations appear in very diverse situations, with objects of different categories, so it's hard to have 1000's of examples for each class of relation.
For example, humans can identify a monkey riding a Segway on the airport runway, but there probably is no such thing in the training set, even if it is quite large. The neural net might not know if that constitutes a "riding" action because it has never seen such a combination. Maybe the monkey is jumping over the thing and the picture shows it in proximity to it, not riding it - a human would know that a slight gap means there is no riding taking place.
Then, the even harder problem is to predict the consequences of actions on objects and just to physically simulate the scene. Such knowledge is useful in robot action planning. Beyond computer vision, there is also a need to create a "mental simulator" that has theory of mind and can simulate other agents (what humans intend), and we need simulators, both physical and mental to create the next level of AI.
The link only works if you are logged in; otherwise it says the page was not found, which is the wrong message because it makes you think the page doesn't exist even if you then log in.
Google doesn't even show a captcha if they have enough tracking info to verify you're a human, which is pretty simple for them if you don't clear all your cookies for a few days. I'm pretty sure Google considers it a failure if they have to show you a captcha at all, but they also don't feel the need to make the captchas particularly easy when they do.
Except when it doesn't. Every so often it suddenly spams you with a sheer number of captchas on sites that use Google's reCAPTCHA API. Then you have to solve the detect-the-houses, cars, store-fronts, and street-signs challenges several times in a row, or be prepared to solve several more captchas before you're allowed to move on. I wish Google would fix reCAPTCHA. It used to be so good.
Are you using an adblocker or any other privacy extensions (PrivacyBadger, Disconnect, etc.) ?
If you are, Google will spam you to death with captchas; it kinda makes sense because captchas are getting easier to solve for machines, so apparently the new test of humanity is whether Google can track your activity on other sites.
Obviously I am 99.99% sure I am clicking correctly, yet reCAPTCHA shows me captcha after captcha until it lets me move on. Is it a system bug, or has their bot detection become quite unreliable?
You can complain to the site operator. Google is not forcing their captcha on websites. If enough people do that, perhaps site operators will take notice. Tell them exactly what you don't like, so they don't eventually switch to a different vendor with the same annoying captcha tech.
I remember one particular set of pictures with both buses and coaches (and some with neither) on it, and it asked me to click on the buses. It said I'd got it wrong, but presented an almost identical set afterwards, and I included the coaches, but that was also wrong.
I've seen this happen, and I think it was because the IP address was previously used for scraping of some sort, or somehow set off some flags at Google. VPN providers can cause this because their IPs are sometimes used for just such things.
I must be really unlucky in that I regularly get pretty challenging captchas which require a lot of tries before it lets me through. It's frustrating enough to make me avoid future visits to sites which use their captchas.
I take it you don't use a VPN. I often have to pass a captcha just to use Google search when on an IPv4 VPN address. (Considering that ISPs are now allowed to sell my data, using a VPN seems like an obvious choice.)
It's annoying to the point that it's pushed me to use DuckDuckGo more often and I tend to avoid platforms that require me to continually take their captchas. I used Discord for a little while, but once it started asking me to verify my humanity again periodically per session, I booked it.
So, your traffic is coming from some provider that purposefully obfuscates info, and your traffic is mixed with a bunch of other people's? I can't imagine why they'd view you as less likely to be verified as a real person...
I'm not saying that it's an unreasonable assessment on their part, but it is a large annoyance.
My question is: are they doing this to simply get more training data for image classification, reduce server load by minimizing automated traffic, or to sanitize their queries for human input for NLP models?
> My question is: are they doing this to simply get more training data for image classification, reduce server load by minimizing automated traffic, or to sanitize their queries for human input for NLP models?
As someone who works in an industry where CAPTCHAs have historically played a large role, and where some players flat out use technology to bypass them (via proxy and/or VPN services to get good IP addresses), I imagine those automated systems corrupt the CAPTCHA system somewhat, since it looks like a large corpus of humans behaving in a certain manner when it's not humans at all. It likely also causes those IPs to be considered highly suspect by the CAPTCHA system whenever they're encountered.
For your next questions, the industry is event ticket resale, and no, we don't do that (there are aboveboard ways to function in this market that rely less on brute force and more on data mining and analysis for specific targeted investment, and sometimes long after it's been on sale).
The future is going to be your corporation/country/blockchain's AI vs your adversary's corporation/country/blockchain's AI with vast numbers of humans in the middle of the whole sh*tstorm just trying to survive and live a tolerable life.
That's good, though, I wouldn't want a future where a single AI has all the power. We need a multitude of AIs to increase equality for humans and diversity for AI.
My overall feeling, as someone who wants to start getting into visual recognition, is that there are a bunch of great libraries/ecosystems to choose from, all with pros and cons, but I honestly don't want to make the wrong decision and end up stuck later on. Does anyone have any advice on what I should use to have a camera (RPi) recognize most common objects, and then add a layer where we can teach it specific objects, i.e. putting a name on a person or a pet? Thank you!
You should be set with OpenCV or JavaCV. JavaCV has more ported image/video processing libraries than OpenCV and is a bit more customizable in my experience; JavaCV literally includes OpenCV natively, just with a different wrapper. I would only suggest OpenCV if you are working with Python. It provides enough tools for basic to moderately complex objects, and if you want to mature in it you can then open up the hood. As for porting that pet project to a mobile Android device, I would suggest going straight to JavaCV, even though OpenCV has wrappers for Android and Java. The reason is that at some point you'll want to ditch their rigid methods of obtaining video/pictures, and JavaCV has served me well when integrating with Android. Also, if you eventually want to scale the processing up into a web service, it's easier.
Not sure if you're looking for CV or DL libraries. If you're looking for DL libraries, you can't go wrong with Tensorflow or PyTorch for research/development, and Tensorflow or Caffe2 for deployment. Tensorflow's a bit difficult to learn, but has a lot of great tooling around it, and PyTorch is the opposite. The other frameworks are fine, but don't have the same amount of documentation & beginner resources.
The first lesson is image classification ("is this a picture of a cat or a dog?"). Given that OP is commenting on an object detection library release, though, I assume they're interested in object recognition/detection/segmentation rather than just image classification. So, more like: "what things are in this image and where are they?" or even just "where are the dogs in this image?"
That's also covered eventually in fast.ai, but not until the second course if memory serves.
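To make the distinction concrete, here is a sketch of the two output shapes (all labels, scores, and boxes are made-up illustration values, not any particular library's API):

```python
# Image classification: one label (or label distribution) per image.
classification = {"label": "dog", "score": 0.93}

# Object detection: a list of (label, score, bounding box) entries,
# one per object found; boxes here are (x1, y1, x2, y2) pixel corners.
detection = [
    {"label": "dog",    "score": 0.91, "box": (12, 40, 200, 310)},
    {"label": "dog",    "score": 0.87, "box": (250, 60, 420, 330)},
    {"label": "person", "score": 0.78, "box": (400, 10, 560, 330)},
]

# "Where are the dogs in this image?" is then just a filter over detections:
dog_boxes = [d["box"] for d in detection if d["label"] == "dog"]
```

Segmentation goes one step further again, labelling per-pixel masks instead of rectangular boxes.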
This looks amazing from the computing point of view (and is an achievement of sorts), yet the confidence percentages are lower than what an average human could achieve (as in the case of CAPTCHA tests).
Meanwhile, I wonder about the human costs if systems like these are adopted for purposes where they may be ill suited for, especially cases where their confidence scores are ignored (or mistakenly assumed to be 100% even when they're lower). Anyone have reading material on this?
My primary worry is law enforcement and government surveillance considering these systems infallible and making judgments or life-changing decisions based on interpretations like this from computers. Computing has improved our lives a lot, but sometimes I feel there's an air of overconfidence that clouds our judgment.
Does anyone know an alternative that works on RaspberryPi? This states: "Detectron operators currently do not have CPU implementation; a GPU system is required."
+1 for the Google object detection API. The trained model is quite huge, though: 200 MB, based on a ResNet Faster R-CNN. There are creative ways of chunking this model to keep it small.
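One interpretation of "chunking" (purely illustrative; not how the API itself ships models) is splitting the weight file into fixed-size pieces that can be downloaded, cached, or lazily loaded independently and reassembled on device, so the app never has to hold or fetch the whole 200 MB at once:

```python
def chunk_bytes(blob, chunk_size):
    """Split a byte blob into fixed-size chunks (the last may be shorter)."""
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]

def reassemble(chunks):
    """Concatenate chunks back into the original blob."""
    return b"".join(chunks)

# e.g. a (fake) 2.5 KB "model" split into 1 KB pieces -> 3 chunks
model = bytes(range(256)) * 10
chunks = chunk_bytes(model, 1024)
```

Other readings of "keeping it small" (pruning, quantization, distillation) actually shrink the weights rather than just partitioning them; which one the parent meant isn't clear from the comment.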
Maybe https://pjreddie.com/darknet/install or one of the various forks. YOLO is supposed to be one of the best object detection systems as far as accuracy/speed is concerned.
Look at the first table in https://arxiv.org/pdf/1708.02002.pdf (the RetinaNet paper, code + models included in this release). There’s a substantial accuracy cost for YOLO vs other, similarly fast (within a factor of 2 or so), methods.
Unless I misunderstood figure 2, YOLO seems to be more than twice as fast as the runner-up, and yes, I agree and was already aware that YOLO's accuracy is not as good as other detectors'.
It's a bit disappointing, when caffe2 was released it was stated that mobile is a big focus, but things like this don't support mobile (even though some of this was demoed by FB on a phone).
FWIW we have some similar R-CNN models using this codebase running in the FB apps on iOS/Android with Caffe2 - with some neat bells and whistles (eg full execution of the compute graph on the device GPU where available via Metal/OpenGL). I’ll look into adding a mobile tutorial for Detectron if you’re interested.
That would be awesome. But - is a tutorial enough? From what I gather from the readme the custom ops have cuda implementations and no CPU ones, which you'd like to have as a fallback on mobile (even if metal/opengl implementations exist), or am I wrong?
I was running a NAS-based object detection model on my big new MacBook and it was taking about 30 s/image on an unoptimised TensorFlow build. I then tried a MobileNet model, which took about 3-4 s/image.
> Beyond research, a number of Facebook teams use this platform to train custom models for a variety of applications including augmented reality and community integrity.
I would expect it to work for detecting mismatched content for community types in general. For example, preventing pictures of cats or giraffes from being uploaded as a product photo on Poshmark when it's supposed to be a pair of shoes.
That type of check should become standard in a short amount of time for all communities that accept photos (excluding general-purpose ones, e.g. Imgur).
I can't wait for Facebook Marketplace to fail. The constant stream of useless ads is obnoxious. I am using Facebook less because there is no way to block that low-quality content.
When will counterfeits be detectable with somewhat decent accuracy? Is that something that has been attempted?
I've checked out Facebook Marketplace a few times since the launch, and every time I'm just overwhelmed by the sheer number of better or worse counterfeits.
On top of that, I started getting notifications about more counterfeits. For a short while I reported them, but after a while it felt pointless, and now I just ignore the Marketplace tab.
CUDA needs driver support to talk to the GPU, and since it is proprietary Nvidia technology, the open-source driver can't support it. So you either run the Nvidia driver or you have to use OpenCL.
I had to specifically disable nouveau to get cuda to install correctly on ubuntu 16.04. You need an nvidia card and drivers that are new enough to run the later versions of cuda.
Is there a class of math problem humans can solve but computers cannot? Then we could just use these problems as a guaranteed test instead of the current CAPTCHA arms race.
There is no arms race. 99.9% of the time Google knows if you're a robot based on your browser state. They make you label the images because it's a free way to get training data.
I see this argument a lot and considered it to be the case too, but I have to wonder if it really is. Wouldn't Google be empowering people to really mess up their training data?
If I'm trying to automate a system to fool their captcha, I'm probably giving them a lot of bad results. Or I could just be intentionally feeding them bad data; the fact that failing the captcha keeps letting me make more and more inputs would enable someone to do that for as long as they like.
I believe they cross-check between different users.
I believe it used to work something like this on the old reCAPTCHA system (the one with the words or house numbers): they show you two words, one known by reCAPTCHA and one unknown. If you enter the known one correctly, you can enter. The unknown one is presented to other users, and when there is enough consensus amongst users it is promoted to a known one. So it's hard to mess with the system as an individual.
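A toy sketch of that consensus-promotion logic (the thresholds are made up for illustration, not reCAPTCHA's actual values):

```python
from collections import Counter

def promote(answers, min_votes=5, min_agreement=0.8):
    """Return the consensus answer once enough users agree, else None."""
    if len(answers) < min_votes:
        return None  # not enough data yet; keep showing the challenge
    answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(answers) >= min_agreement:
        return answer  # promoted: can now serve as a "known" word
    return None  # too much disagreement; stays unknown

# One attacker's bad answer is drowned out by honest users:
votes = ["fought"] * 9 + ["xxxxx"]
```

Here `promote(votes)` yields `"fought"`, which is why individual poisoning attempts had little effect: an answer only became ground truth once many independent users converged on it.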
The thing you're missing is volume. Even if you assume the vast majority of people will attempt to mess up your data, when you have enough people contributing, you can look at the answers in aggregate and, based on patterns, disregard the bad data. It might be "expensive", but still worth it.
I've seen bot-detection arms races in video games first hand, and there is a lot more to gain on websites. Surely there are engineers smart enough to combat Google's bot detection, and with massive sets of labeled data Google stands a fighting chance.
Google has basically your entire browsing history, and search history. Google then compares that with what is considered normal for a human, and then lets you in if you pass that check.
Of course, if you run IRC bots that scrape Google with a headless browser to implement a .search feature and offer link titling in IRC, and you use a separate bucket of cookies and a separate IP for every bot, your bots now also have a human-looking search and browsing history, and will pass all reCAPTCHAs too...
I'm happy that tech companies are open-sourcing basic research all the time, and I've been thinking a lot about what would have happened if large pharmacy companies did the same thing. I'm hopeful that with new biotech companies, the science behind curing people will advance faster as well.
> thinking a lot about what would have happened if large pharmacy companies did the same thing
There is a company creating a 3d-printed chemical reactor. By downloading a schematic and buying some raw substances, you can create your own lab. It can be used to synthesise drugs in remote areas, such as on Mars, or to make generics for cheap. The exciting part is that the reactor schematic can be downloaded and shared easily. It can also make illegal drugs just as easily as 3d-printers can print guns.
Unlike the case in tech, pharma basic research is far less important in advancing our knowledge than academia's. A good example comes from the last few blockbuster cancer therapies: CAR-T cells and checkpoint blockade both arose in academic labs.
Also, for drugs that do make it to market, efficacy and side effect information is published as a condition of drug approval, at least for new drugs.
Whether basic science research papers should be behind a paywall is a wholly separate issue, but the life science community largely shares its finished products. Indeed, there’s even a push to share early stage data, too.
This is great! I do wish this were written in something other than Python, though. What is the carbon footprint of all this compute-intensive computer-vision code being run billions of times a day in Python? Someone should calculate...
If you choose to measure something in simply one dimension, don't be surprised (offended?) when someone else builds on your premise.
Humans are objectively terrible for the environment. Now you might start to argue other metrics instead, or that on an nth removed degree it/we might result in a net positive, but then you've abandoned your initial premise anyways.
It is funny to see this comment at "-4" already... What's so offensive? After all, Facebook has RocksDB in C++, Presto in Java, and a PHP-to-C++ compiler, so they clearly have both the belief and the skill to move away from interpreted programming languages for performance-sensitive code.
For some reason, people are offended by gross misunderstandings. This framework is "in Python", but the Python layer is a binding that sets up code that runs natively (I'm not sure of all the details myself; parts are written in CUDA). TensorFlow is the same way: the interface is Python, but the computations are not in Python. As you point out, that wouldn't make sense.
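NumPy is probably the simplest everyday illustration of this pattern: the Python lines only describe the computation, while the heavy lifting executes in compiled native code, just as TensorFlow and Caffe2 dispatch to C++/CUDA kernels:

```python
import numpy as np

# Python is only the orchestration layer here; the O(n^3) work of the
# matrix multiply runs inside NumPy's compiled BLAS routines, not in
# the interpreter.
a = np.ones((256, 256), dtype=np.float32)
b = np.ones((256, 256), dtype=np.float32)
c = a @ b  # each element is a dot product of 256 ones with 256 ones
```

So the per-image cost of a framework like this is dominated by the native kernels, and the Python overhead is a thin, roughly constant wrapper around each operation.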