How Google Can Open Up Image Labeler To Entrepreneurs And Accelerate A.I. Progress

Google Image Labeler (http://images.google.com/imagelabeler/) lets people play a game in which they label images, which gives Google data for its machine learning research on image recognition. Why doesn't Google open this up to entrepreneurs? (Or Amazon, Yahoo, or some startup could take up the opportunity to do the same.)

What I'm suggesting is that a "gold rush" type situation could form around the building of classifiers (a type of machine learning algorithm), like the one that occurred with website creation. Millions of people are trying to earn an income online by creating content for AdSense and affiliate programs (and others create content for free with Wikipedia and open source). Most will fail to generate a reasonable income, but in the process a large amount of wealth has been created.

The same could be done for classifiers and artificial intelligence. Creating each classifier would require collecting a thousand or two pieces of data (image/audio/text), an effort similar to creating a small website. For each use of the classifier a customer would pay, say, some fraction of a cent (or whatever the market will bear).

I'm trying to draw a parallel between how content on the web is being created and how an artificial mind might be created. If you sat down and tried to engineer a giant encyclopedia within one company or one community, you would end up with Britannica (within one company) or Wikipedia (within a community of enthusiasts). However, the web at large is much larger than both of these because of the economic incentives for people to create articles. Many people try and fail (in effect working for free), but the progress is rapid, as in evolution. In the same way, incentivizing people to produce classifiers would have a similar effect of rapid progress.

The classification algorithms are already available: boosting, random forests, support vector machines. What is needed is for some companies to host them and make them easily available, so that ordinary people can start using them and get the ball rolling (the way blogs and AdSense made it easy for people to publish and monetize their content).
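To give a feel for how small such an algorithm can be, here is a minimal sketch of boosting (AdaBoost with one-threshold "decision stumps") in plain Python. It is an illustration under strong assumptions, not what a real host would run: each example is reduced to a single hypothetical numeric feature, labels are +1 or -1, and the function names (train_stump, adaboost, predict) are made up for this sketch.

```python
import math

def stump_predict(x, threshold, polarity):
    # A decision stump: one threshold on one numeric feature.
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    # Try every observed value as a threshold and keep the stump
    # with the lowest weighted error.
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of all stumps.
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (made up): a hypothetical "cat-likeness" score per image.
xs = [1, 2, 3, 6, 7, 8]
ys = [-1, -1, -1, 1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
train_predictions = [predict(ensemble, x) for x in xs]
```

A real image classifier would work on thousands of features per image rather than one number, but the training loop has the same shape, which is why hosting it as a service seems plausible.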

For example, I might collect 1000 images of cats and send them to a classifier host, which would run a support vector machine on them for 2 hours and charge me 40 cents (at 20 cents per hour of usage of one CPU). I might then send the resulting classifier file to another host (whose business is to host classifiers) and set a rate of, let's say, 1 cent per 1000 uses (I might receive 80 percent of this, and the host 20 percent). Then anyone who wants to find out whether there are cats in their image would connect to the cat classifier and use it, paying me a fraction of a cent each time.
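The arithmetic in this example can be checked with a short sketch. The rates are the ones quoted above; the function names and the break-even question are my own additions for illustration. Fractions are used so that tiny per-use amounts are not lost to floating-point rounding.

```python
import math
from fractions import Fraction

CPU_RATE_CENTS_PER_HOUR = 20        # 20 cents per CPU-hour
TRAINING_HOURS = 2                  # 2 hours of training
PRICE_CENTS_PER_1000_USES = 1       # 1 cent per 1000 uses
CREATOR_SHARE = Fraction(80, 100)   # creator keeps 80 percent

def training_cost_cents(hours, rate_cents_per_hour):
    # One-off cost paid to the training host.
    return Fraction(hours) * rate_cents_per_hour

def creator_revenue_cents(uses):
    # Creator's cut of the usage fees after the host takes its share.
    gross = Fraction(uses, 1000) * PRICE_CENTS_PER_1000_USES
    return gross * CREATOR_SHARE

def break_even_uses():
    # Smallest number of uses whose revenue covers the training cost.
    cost = training_cost_cents(TRAINING_HOURS, CPU_RATE_CENTS_PER_HOUR)
    return math.ceil(cost / creator_revenue_cents(1))

cost = training_cost_cents(TRAINING_HOURS, CPU_RATE_CENTS_PER_HOUR)
uses_needed = break_even_uses()
print(f"training cost: {cost} cents, break-even after {uses_needed} uses")
```

At these rates the 40-cent training bill is recovered after 50,000 uses of the cat classifier, not counting the time spent collecting the images.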

For any given image I would probably want to apply thousands of different classifiers. Somebody might create a service that aggregates 2000 different animal classifiers; I would send my image there and receive the results for a fee (each of those classifier creators would receive a sum of money, as would the aggregator).
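An aggregator of this kind could be sketched roughly as follows. Everything here is hypothetical: the Aggregator class, the owner names, the 3/100-cent fee, the 20 percent aggregator cut, and the even split among creators are assumptions for illustration, and the "classifiers" are stand-ins that just look for a keyword in a set of made-up image features.

```python
from fractions import Fraction

class Aggregator:
    def __init__(self, fee_cents, aggregator_share):
        self.fee_cents = Fraction(fee_cents)              # fee charged per image
        self.aggregator_share = Fraction(aggregator_share)
        self.classifiers = []                             # (label, owner, function)
        self.payouts = {}                                 # name -> cents earned

    def register(self, label, owner, classify):
        self.classifiers.append((label, owner, classify))

    def _credit(self, name, cents):
        self.payouts[name] = self.payouts.get(name, Fraction(0)) + cents

    def classify(self, image):
        # Run every registered classifier and collect the labels that fire.
        if not self.classifiers:
            return []
        labels = [label for label, _owner, fn in self.classifiers if fn(image)]
        # Assumed split: the aggregator keeps its share, and the rest is
        # divided evenly among all classifiers that were run.
        self._credit("aggregator", self.fee_cents * self.aggregator_share)
        creator_pool = self.fee_cents * (1 - self.aggregator_share)
        per_classifier = creator_pool / len(self.classifiers)
        for _label, owner, _fn in self.classifiers:
            self._credit(owner, per_classifier)
        return labels

def keyword_classifier(keyword):
    # Stand-in for a real hosted classifier.
    return lambda image: keyword in image["features"]

agg = Aggregator(fee_cents=Fraction(3, 100), aggregator_share=Fraction(1, 5))
agg.register("cat", "alice", keyword_classifier("whiskers"))
agg.register("dog", "bob", keyword_classifier("floppy ears"))
agg.register("horse", "carol", keyword_classifier("mane"))

labels = agg.classify({"features": {"whiskers", "fur"}})
print(labels, dict(agg.payouts))
```

Paying every classifier that was run, rather than only the ones that matched, is one possible design; a market might instead reward only the classifiers whose answers the customer found useful.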

Initial applications for this include surveillance technology, where it is already in use (http://en.wikipedia.org/wiki/Video_Analytics). A much larger market is robotics.

If you think of intelligence as the ability to predict accurately, a giant web of classifiers that predict accurately could be construed as a form of artificial general intelligence, or a human-type artificial intelligence. Having enough data about the world, because you have created many classifiers that can recognize the semantic events in video and speech, would allow you to make all the same types of recognitions and predictions that a human would make.

Machine learning and computer vision open source tools http://torch5.sourceforge.net/ http://torch3vision.idiap.ch/ http://opencvlibrary.sourceforge.net/

To see where this might lead see Hans Moravec's page at CMU http://www.frc.ri.cmu.edu/~hpm/talks/robot.evolution.html http://en.wikipedia.org/wiki/Technological_singularity http://en.wikipedia.org/wiki/Post_scarcity

For more info see discussions at artificial general intelligence http://www.mail-archive.com/agi@v2.listbox.com/msg12876.html and news.ycombinator.com http://news.ycombinator.com/item?id=271202

For some initial applications see http://www.cs.ubc.ca/spider/lowe/vision.html http://www.gslis.utexas.edu/~palmquiscoursesproject98comvision/facerec.htm

For current research along this line see http://www.idiap.ch/demos.php http://labelme.csail.mit.edu/ http://www.cs.washington.edu/research/imagedatabase/ http://research.microsoft.com/vision/cambridge/recognition/ http://www.research.ibm.com/VideoAnnEx/