
That's not actually how artificial neural networks in pattern recognition tasks work.

Once you learn and understand the learning algorithms, you will experience a Wizard of Oz moment: you see how the trick is done, and it's not quite as impressive as you had hoped. This is confirmed by the fact that the understanding matches exactly the limitations of a typical neural net that you've established experimentally.

The tasks that fall within these limitations they perform quite well at, but it's not hard at all to understand how those tasks are performed. In most cases you can basically run the algorithmic equivalent of a trace debugger on the system. Insofar as you can compare it with a brain (but don't do that, it's the wrong analogy--compare it with statistical regression instead), it's one you can take apart entirely, reassemble, run partial components of, tweak, and understand exactly how it works down to the instruction level. Plus you can do these things automatically for the most common cases.
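To make the "trace debugger" point concrete, here is a minimal sketch (my own toy example with made-up sizes and random weights, not anything from this thread): a tiny feed-forward net whose every intermediate value you can print and inspect, precisely because it is just a handful of matrix multiplications and squashing functions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical toy network: 4 inputs -> 3 hidden units -> 1 output.
    W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
    W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def traced_forward(x):
        # Forward pass that exposes every intermediate quantity, like a trace debugger.
        pre1 = W1 @ x + b1          # hidden-layer pre-activations
        h = sigmoid(pre1)           # hidden activations
        pre2 = W2 @ h + b2          # output pre-activation
        y = sigmoid(pre2)           # final output
        for name, val in [("input", x), ("pre1", pre1), ("hidden", h),
                          ("pre2", pre2), ("output", y)]:
            print(f"{name:>7}: {np.round(val, 3)}")
        return y

    traced_forward(np.array([1.0, 0.0, 0.5, -0.2]))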

In short, there is nothing "magic" about neural networks, or any pattern recognition algorithms.

And to get back on the subject: if you intend to work with machine learning algorithms in a professional/consulting/project setting, it is probably best to first check the scientific literature to see whether anything like this has been done, and what accuracy/false positives/false negatives/ROC curve you can expect. Don't expect to beat those results, either (you may, but planning to beat them by any significant margin is asking for trouble--especially if it turns out it just can't be done). Then, I think it is wise to first build a "fake" system/interface without the ML component, limited to just the labeled training examples (if you can't acquire these, give up now[0]), and see whether the system is actually useful from a software engineering, application and usability perspective, before you begin (the fun part of) tweaking an ML algorithm, only to find out later that, even if it did its job perfectly, it wouldn't be all that useful in the operating environment in the first place.
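The "fake system first" idea could look something like the sketch below (my own illustration; the class and names are placeholders): a stub classifier that only replays the labeled examples via a nearest-neighbour lookup, but exposes the same predict() interface the eventual ML component would, so you can wire up and evaluate the surrounding application before any real learning happens.

    import numpy as np

    class StubClassifier:
        # Drop-in placeholder for the future ML model.
        def __init__(self, examples, labels):
            self.examples = np.asarray(examples, dtype=float)
            self.labels = list(labels)

        def predict(self, x):
            # Return the label of the closest labeled example.
            dists = np.linalg.norm(self.examples - np.asarray(x, dtype=float), axis=1)
            return self.labels[int(np.argmin(dists))]

    # Wire this into the real application/UI first; swap in the trained model later.
    stub = StubClassifier([[0, 0], [1, 1]], ["negative", "positive"])
    print(stub.predict([0.9, 0.8]))   # -> "positive"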

This is different if you work for a corporation with a huge research budget, of course, like Microsoft or Google. But even then, I think that for their ML-enabled end products, reasoning similar to what I sketched above is applied. The same goes, of course, for every type of project that depends on an unknown, yet-to-be-researched technology.

[0] Yeah yeah, unsupervised learning. But if you know what you're doing that well, why are you listening to my advice? :)




I think this is wrong. Take, for instance, the task of classifying images. You can train an RBM with backprop (after pre-training with the contrastive divergence algorithm) to correctly classify images. In the process it has automagically determined properties of the image which allow it to perform the classification. These properties are combinations of pixel elements. So it has in effect determined how to solve a problem without your input. In a similar way, solving a set of simultaneous equations using any of a huge array of mechanical mathematical techniques is also solving a problem which you do not personally know how to solve. You could even consider using a library of code as solving a problem you do not know how to solve...
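For reference, a rough sketch of what one contrastive-divergence (CD-1) update for a binary RBM looks like (my own illustration, not the poster's code; sizes and data are placeholders). The rows of W are exactly those "combinations of pixel elements" that get determined automatically during training.

    import numpy as np

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 784, 64, 0.01       # e.g. 28x28 images, 64 features
    W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def cd1_step(v0):
        # One contrastive-divergence (CD-1) update on a single binary "image" v0.
        global W, b_v, b_h
        # Positive phase: infer hidden features from the data.
        h0_prob = sigmoid(W @ v0 + b_h)
        h0 = (rng.random(n_hidden) < h0_prob).astype(float)
        # Negative phase: reconstruct the image and re-infer the features.
        v1_prob = sigmoid(W.T @ h0 + b_v)
        h1_prob = sigmoid(W @ v1_prob + b_h)
        # Approximate gradient: data statistics minus reconstruction statistics.
        W += lr * (np.outer(h0_prob, v0) - np.outer(h1_prob, v1_prob))
        b_v += lr * (v0 - v1_prob)
        b_h += lr * (h0_prob - h1_prob)

    cd1_step((rng.random(n_visible) < 0.5).astype(float))  # placeholder "image"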


"In the process it has automagically determined properties of the image which allow it to perform the classification."

(Emphasis mine.) But that's the point; it may be "auto", but if you understand how NNs work it's not magic. It's not even all that hard to understand (considered broadly), and once you understand how they work it is, for instance, easy to construct cases they fall flat on....

"So it has in effect determined how to solve a problem without your input."

... and it's less "auto" than you think. It figured out how to solve a problem based on your input of sample cases. And there's a certain amount of art involved in selecting and herding your sample cases, so regrettably you can't discard this part, either. Just flinging everything you've got at the NN is not going to produce good results.

If you don't understand NNs, you are unlikely to get good results by just flinging data at them; if you do get good results, it's probably because you have a problem that could have equally well been solved by even simpler techniques. They're really not magic.
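As a hedged illustration of "easy to construct cases they fall flat on" (toy code of my own, not from the thread): XOR is the textbook example for a single-layer perceptron, which can never get all four cases right no matter how long you train, because the classes are not linearly separable.

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0])                      # XOR labels

    w, b = np.zeros(2), 0.0
    for _ in range(1000):                           # perceptron learning rule
        for xi, yi in zip(X, y):
            pred = int(w @ xi + b > 0)
            w += (yi - pred) * xi
            b += (yi - pred)

    preds = [int(w @ xi + b > 0) for xi in X]
    print("predictions:", preds, "targets:", list(y))   # never all four correct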


It's classic emergent behavior. While you may understand how the algorithm works, and even be able to step through and see how each neuron affects the whole, that doesn't mean you know why the correct answer is achieved through all of them combined.

The classic example is facial recognition. Training a neural network for facial recognition will result in lots of neurons each contributing a very small part of the whole, and only when all (or most) of them are involved is the answer correct.

To most people, this (emergent behavior) is "magic".


But it's not emergence. Using Gaussian elimination to solve a gigantic system of equations isn't emergence either, even if there are a few too many numbers to carry in your head at once. (As a matter of fact, solving systems of linear equations is part of the RBM training algorithm.)

And even if it were emergence, that wouldn't automatically imply we don't understand it or that it's "magic". The famous Boids flocking simulation is a classic example of emergence, and it's not very mysterious. Yes, large-scale behaviour emerges from simple rules, and it's amazing that this happens, but that doesn't put up a barrier to our understanding, analyzing and modelling that large-scale behaviour. Crystallisation is emergence too, and again we model it with a bunch of very hard combinatorial math.
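For the curious, a Boids-style update step looks roughly like this (a toy sketch of my own; the neighbourhood radius and rule weights are arbitrary): three simple local rules per bird--cohesion, alignment, separation--from which flocking emerges globally, yet nothing about it is beyond analysis.

    import numpy as np

    rng = np.random.default_rng(0)
    pos = rng.uniform(0, 100, size=(30, 2))   # 30 birds in 2D
    vel = rng.normal(size=(30, 2))

    def step(pos, vel, dt=0.1):
        new_vel = vel.copy()
        for i in range(len(pos)):
            neighbours = np.linalg.norm(pos - pos[i], axis=1) < 15.0
            neighbours[i] = False
            if neighbours.any():
                cohesion   = pos[neighbours].mean(axis=0) - pos[i]      # steer toward the group
                alignment  = vel[neighbours].mean(axis=0) - vel[i]      # match neighbours' heading
                separation = (pos[i] - pos[neighbours]).sum(axis=0)     # avoid crowding
                new_vel[i] += 0.01 * cohesion + 0.05 * alignment + 0.002 * separation
        return pos + dt * new_vel, new_vel

    for _ in range(100):
        pos, vel = step(pos, vel)
    print(np.round(pos[:3], 2))   # inspect a few positions; groups form over many steps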

But in this case, neural networks are not an example of emergence. They are really built in a fairly straightforward manner from components that we understand, and the whole performs as the sum of the components, like gears in a big machine.


Understanding NNs is not the same as understanding "how they do it". You may have a good understanding of how the algorithms work, but after training your moderately sized network to do what it does, it's not very easy to know what _exactly_ it does to solve your problem.

I think that's what chii was trying to point out; not to say that devising them and training them is incomprehensible.


I have an upcoming project where we'll be implementing a pattern recognition algorithm - think smart flash cards. I've read up a bit on spaced repetition and a little on how neural networks work.

With your experience, do you have any recommended reading that would help me with the neural network learning curve?



