All Models of Machine Learning Have Flaws (2007) (hunch.net)
57 points by walterbell on Jan 24, 2016 | 17 comments



The resurgence of AI is just as hyped as expert systems and logic computers were in the 1980s. I just hope the inevitable AI bubble crash is not as damaging as it was then.


>> The resurgence of AI is just as hyped as expert systems and logic computers were in the 1980s.

Yep; for instance, see how recent results are advertised as showing that machine learning algorithms are "better than humans" at various tasks (primarily image recognition).

The thing to keep in mind is that all that old tech did not "fail" as is often assumed. Expert systems worked perfectly fine, as did a host of other GOFAI techniques. It's all waiting for us just as the researchers left it when they jumped overboard to fields that still had funding, kind of like a Mary Celeste of science.

My hope is that, rather than going through another AI winter, we can find a way to pick it all up where the '80s and '90s left off and advance the science further, while absolutely keeping the best bits of the current machine learning explosion. Machine learning is not a new thing anyway; it too goes back to the early days of AI. We have more data and more computational power today, but there are practically no genuinely new approaches: deep learning, SVMs, decision trees... those are not really "new" as such, it's just their popularity that is new.

So we're at a point where we need to learn our history before we end up repeating it unwittingly. And we can: we have the theoretical foundations, we have practical approaches, and we can put the two together and win.


Yeah, I'd love to see something like a recurrent neural network that builds the rules behind an expert system. Then hopefully the resulting rule set can actually be debugged and tweaked by humans in a meaningful way, even though its first implementation is automatically generated.


This could be achieved by following the edges of a decision tree from the root to all leaf nodes [0] (sketched below), though the results are not always easily interpreted by humans.

[0] http://www.saedsayad.com/decision_tree.htm
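
As a rough sketch of what that looks like in practice (assuming scikit-learn, which the linked page doesn't use; the dataset and depth are arbitrary), the root-to-leaf paths of a trained tree can be printed as if/then rules:

    # Minimal sketch (assumes scikit-learn): fit a small tree and print its
    # root-to-leaf paths as human-readable if/then rules.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()
    tree = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)

    # Each printed branch is one rule: a conjunction of threshold tests
    # ending in a predicted class.
    print(export_text(tree, feature_names=list(iris.feature_names)))

Even on a toy dataset the output illustrates the caveat above: the rules are readable, but not necessarily meaningful to a human.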


I can't state this strongly enough: machine learning is not hype. It has been going strong longer than you may realize, has been generating more sustained revenue than you may realize, and has been disrupting (ugh sorry, I hate that term) more industries than you may realize.

I started my career working at a company that built its first machine-learning-based systems in the mid '80s. Very successfully, too: their early products are ubiquitous today, and probably will be for a long time. You almost certainly use them indirectly several times a week without realizing it, unless you keep your money under a mattress (I don't want to go into too much detail because it's a small industry, but it's related to banking and finance).

Google is, at its heart, a machine learning company that branched out as it grew. PageRank was developed nearly 20 years ago and transformed the internet; can you imagine using the internet without it and its successors?

I could go on, but the fact is that hundreds of billions of dollars of value in the economy are a direct result of machine learning, or were enabled solely by it. And this has been true for a long time. Sure, it's very popular right now, but it's not going away any time soon. Calling machine learning hype is like calling the internet hype because of the 1999 dot-com bubble.


I've often heard fans of AI[1] claim that a benefit of machine-based evaluations is the removal of human bias. The machine applies the rules consistently, while humans allow race, sex, religion, politics, eye-color[2], etc to bias their decisions, often without knowing the bias even exists.

This sounds great at first, but it's really just moving the problem around. Instead of many individuals making their own biased decisions, we get a uniform application of the biases of whoever designed the AI. They still may not realize how their biases influence the design process, so it's easy to think that the AI is somehow "neutral".

This is particularly bad because it removes locality from the decision process. While fallible humans sometimes inject their own racism/sexism/etc., the freedom to make those mistakes also gives them the power to fix problems that were never covered by the "official policy".

Decision by remote algorithm is really just another type of central planning, and I expect it to have the same inability to respond to local conditions that caused a lot of the problems in the traditional communist states.

[1] usually combined with a new type of data collection (surveillance-as-a-business-model)

[2] biases that are often taught socially: http://www.pbs.org/wgbh/frontline/film/class-divided/


Machine learning systems are only as biased as their training data and their construction. It doesn't even have to be overt: when Google Photos tagged photos of African-American users as gorillas [1], the algorithm wasn't racist in any recognizable sense, but it reflected the bias of its training data, which didn't have enough African-American people in it for the algorithm to learn about them.

[1] http://www.theverge.com/2015/7/1/8880363/google-apologizes-p...


> training data and their construction

That's what I'm talking about. Machine learning systems - and everything else that is the product of human decisions - are a reflection of the biases, intentions, and other qualities of the designer. Even the initial choice to use a machine learning system is influenced by fallible humans.

Why use machine learning? What kind, and why? What is the underlying model that is being optimized? What data was it trained with? Is the design of the system a proxy for something else, intentionally or unintentionally?

The real world is complicated and messy, and any model of it is always going to be incomplete.


>Machine learning systems are only as biased as their training data and their construction.

The Bias-Variance Tradeoff means that all machine learning involves bias. Without it, you couldn't learn.
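
For reference, that's the statistical sense of "bias": under the usual assumptions (a fixed target function f, noise with variance sigma^2, expectation taken over training sets), the expected squared error decomposes as

    \mathbb{E}\big[(y - \hat f(x))^2\big]
      = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2}
      + \underbrace{\mathbb{E}\big[\big(\hat f(x) - \mathbb{E}[\hat f(x)]\big)^2\big]}_{\text{variance}}
      + \sigma^2

Driving the bias term to zero generally inflates the variance term (the model starts fitting noise), which is the tradeoff being invoked; note it's a different notion of "bias" than the social one discussed upthread.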


I've often heard that benefit mentioned, but not at all in the context of AI. Rather, it's usually in the context of applying any kind of statistics or data analytics to a field previously devoid of it (e.g. Moneyball).

I don't understand why this would be a benefit of AI.


https://media.ccc.de/v/32c3-7457-predicting_crime_in_a_big_d...

If you're still interested, this talk on "predicting crime" specifically cites supposedly "unbiased" policing as a selling point of these new machine learning products.

It all sounds like snake-oil, subconscious prejudices made into models, and confirmation bias. /sigh/


What were the highly profitable industries built around expert systems and logic computers in the 1980s?

The fact that AI seems to be proving itself with real-world, commercially interesting results in fields as disparate as advertising, facial recognition, and self-driving automobiles makes me think the comparisons with the '80s are premature.


>> What were the highly profitable industries built around expert systems and logic computers in the 1980s?

I'm not sure there were any industries built around expert systems as such, but don't forget that once an AI system has a practical application, it isn't considered AI anymore.

Accordingly, expert systems are still widely used in industry today, first of all because there's tons of legacy software created in the '80s and '90s that is still lying around and being maintained, but also because the definition of an expert system -- a database of rules encoding expert knowledge about a given domain, plus an inference engine to select the appropriate rules in any situation -- is exactly what most enterprise applications that manage business rules are, for instance fraud detection at financial orgs (which I'm familiar with).
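
To make that definition concrete, here's a toy sketch of the rules-plus-inference-engine pattern (the rules and field names are invented for illustration, not taken from any real fraud system):

    # Toy rule engine: a list of named (condition, action) rules plus a
    # function returning the actions of every rule that matches the facts.
    # Rules and field names are made up for illustration.
    RULES = [
        ("large_foreign_txn",
         lambda f: f["amount"] > 5000 and f["country"] != f["home_country"],
         "flag_for_review"),
        ("rapid_repeat_txn",
         lambda f: f["txns_last_hour"] > 10,
         "block_card"),
    ]

    def infer(facts):
        return [action for name, cond, action in RULES if cond(facts)]

    print(infer({"amount": 9000, "country": "LV", "home_country": "US",
                 "txns_last_hour": 2}))   # -> ['flag_for_review']

The label has quietly shifted from "expert system" to "business rules engine", but the architecture is the same.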

And of course there are industries where you can't afford to use a machine learning system to do your work. Think of jet engine maintenance, for instance: you don't want a system with a 10% error rate doing that. So I'm told this is the kind of job that the industry, even today, leaves to a good old-fashioned expert system.


> highly profitable industries

Those profits are not necessarily from AI.

> advertising, facial recognition

Yes, surveillance capitalism is highly profitable. AI may be enabling and magnifying that problem, but a lot of the profits at the moment are due to technology making it much easier to obscure business models, and to the difficulty of challenging "too big to fail"-sized accumulations of wealth.

The point being that while AI has some really interesting uses (with more being discovered regularly), its profitability is probably somewhat closer to that of the fad tech of the '80s. It's the internet combined with very cheap, portable computers - and the shady business models they have created - that is highly profitable.


>> highly profitable industries

Something else: there's a problem with taking profitability as a measure of the success of an AI technique, or of anything really, if for no other reason than that profitability is transient, and much more so in technology, where today's cutting edge is tomorrow's dull blah. Whether a given technology is successful has little to do with whether it works well; it has everything to do with whether the way it is marketed works. Otherwise MS Windows would not be the most widely adopted OS, yeah?

In the end, the only thing that having a profitable industry shows is that you have a profitable industry, nothing else. So some folks make money with machine learning systems and hype them a lot. That's just marketing. If you understand the technology, you realise there's a big difference between the hype and the actual hands-on results.



A comment on that article references http://www.i-programmer.info/news/105-artificial-intelligenc...:

"Early in 2014 a paper "Intriguing properties of neural networks" by ...a team that includes authors from Google's deep learning research, showed that neural networks aren't quite like us - and perhaps we aren't quite like us, as well. What was found was that with a little optimization algorithm you could find examples of images that were very close to the original image that the network would misclassify. This was shocking ... some how complex classifiers such as neural networks create classification regions which are more complex, more folded perhaps, and this allows pockets of adversarial images to exist."



