This is a very interesting perspective on neural networks. Is it novel? I've never seen a geometric interpretation of NNs before.

In particular this:

> (Apparently determining if knots are trivial is NP. This doesn’t bode well for neural networks.)

Is there any theoretical research on obstructions to NN learning? Not that it would change the practice much (for instance, maximum-likelihood learning of Gaussian mixtures is NP-hard, but everybody does it anyway), but it could shed some light on the geometry of the "hard" instances.

EDIT: for example, to get a sense of how ill-conditioned deep NN learning is, a recent paper [1] shows that if we feed an object-classifying NN an image with crafted but unnoticeable perturbations, the predictions change completely.

[1] http://arxiv.org/abs/1312.6199
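
To make that concrete, here is a toy sketch of the kind of crafted perturbation [1] describes: nudge the input in the direction that increases the model's loss while keeping the change to each coordinate small. This is a hypothetical illustration on a throwaway linear classifier, not the paper's actual optimization procedure.

    import numpy as np

    # Toy classifier: p(y=1|x) = sigmoid(w . x). All values are made up;
    # the point is only that a small, structured change to the input can
    # move the prediction a lot.
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    w = rng.normal(size=100)             # pretend these are trained weights
    x = rng.normal(size=100)             # an input to classify
    x += (3.0 - w @ x) / (w @ w) * w     # shift x so the model is confident (logit = 3)
    y = 1.0                              # the label the model currently agrees with

    p = sigmoid(w @ x)
    grad_x = (p - y) * w                 # gradient of the log-loss w.r.t. the input
    eps = 0.1                            # small change to each coordinate
    x_adv = x + eps * np.sign(grad_x)    # push the input "uphill" in loss

    print("original prediction: ", sigmoid(w @ x))
    print("perturbed prediction:", sigmoid(w @ x_adv))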




There is a large body of work from the mid 1980s and early 1990s that addresses the questions of hardness, sensitivity and robustness, coming out of various statistical physics/computer science collaborations. Of course, it is still ongoing, but that was a big period. The basic thrust of this program is to take methods from statistical mechanics and use them to calculate not just worst-case complexity, but also average case, best case, and everything in between. It is hard to think of a reference to end all references, but as far as books go you might want to check out: http://www.amazon.com/The-Nature-Computation-Cristopher-Moor.... For an article that briefly addresses some results related to kSAT there is: http://www.cs.cornell.edu/selman/papers/pdf/99.nature.phase.....
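
For a quick taste of the kSAT results mentioned above: random 3-SAT formulas go from almost always satisfiable to almost never satisfiable as the clause-to-variable ratio crosses a threshold (around 4.27). Here is a brute-force toy version of that experiment (tiny instances only; nothing like the actual methodology of that literature):

    import itertools, random

    def random_3sat(n_vars, n_clauses, rng):
        # Each clause is a list of (variable index, negated?) pairs.
        return [[(v, rng.random() < 0.5) for v in rng.sample(range(n_vars), 3)]
                for _ in range(n_clauses)]

    def satisfiable(n_vars, clauses):
        # Exhaustive search: fine for n_vars around 12, hopeless beyond that.
        for bits in itertools.product([False, True], repeat=n_vars):
            if all(any(bits[v] != neg for v, neg in clause) for clause in clauses):
                return True
        return False

    rng = random.Random(0)
    n, trials = 12, 20
    for ratio in (3.0, 4.0, 4.3, 5.0, 6.0):
        sat = sum(satisfiable(n, random_3sat(n, int(ratio * n), rng))
                  for _ in range(trials))
        print(f"m/n = {ratio:.1f}: {sat}/{trials} satisfiable")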


I have seen Riemannian geometry applied to neural networks. Basically the state space is curved, so you can speed up learning if you take the curvature into account when sampling in the space of possible neural nets (you want to explore the hypothesis space evenly).

Information geometry for neural networks, 1998, Daniel Wagenaar:

http://www.danielwagenaar.net/res/papers/98-Wage2.pdf
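
For anyone curious what "taking the curvature into account" looks like in its simplest form, here is a minimal natural-gradient sketch on a toy Bernoulli model: precondition the ordinary gradient by the inverse Fisher information, so steps are measured in the geometry of the distributions rather than in raw parameter space. The model and names are mine, purely for illustration, not taken from the paper.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def natural_gradient_step(w, X, y, lr=0.1, damping=1e-3):
        p = sigmoid(X @ w)                          # predicted probabilities
        grad = X.T @ (p - y) / len(y)               # ordinary gradient of the NLL
        # Empirical Fisher information for the Bernoulli likelihood:
        # F = E[ x x^T * p * (1 - p) ]
        F = (X * (p * (1 - p))[:, None]).T @ X / len(y)
        F += damping * np.eye(len(w))               # damping keeps F invertible
        return w - lr * np.linalg.solve(F, grad)    # precondition by the metric

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    true_w = rng.normal(size=5)
    y = (rng.random(200) < sigmoid(X @ true_w)).astype(float)

    w = np.zeros(5)
    for _ in range(100):
        w = natural_gradient_step(w, X, y)
    print(w, true_w)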


I think the author has confused NP with NP-hard (or NP-complete). Indeed, it is extremely unlikely that the unknottedness problem is NP-complete, as it is already in NP ∩ co-NP (assuming the generalized Riemann hypothesis), and an NP-complete problem in co-NP would imply NP = co-NP. Furthermore, we know that the problem of computing many knot invariants is "only" BQP-complete, so it's possible that unknottedness is in BQP (i.e., efficiently solvable by quantum computers), though I don't know the expert opinion on this.


Indeed, I read that as NP-complete. That's very interesting; do you have any references on it? Even if knots are ruled out, there might be hard topological decision problems that can be reduced to linear separability under homeomorphism, maybe in higher dimensions.


I think machine learning boils down to 2 fundamental issues:

1) Having a model class that is powerful enough to represent the data.

and

2) Being able to find a "good enough" approximation to the optimal solution over that model class.

In principle, over-fitting can be addressed through the choice of the particular optimization criterion in (2).
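
For instance (my own toy illustration, nothing standard implied): bake a complexity penalty directly into the criterion being optimized, so the objective itself discourages over-fitting.

    import numpy as np

    # Hypothetical example of point (2): an L2 penalty on the weights folded
    # into the optimization criterion.
    def objective(w, data_loss, lam=1e-2):
        return data_loss(w) + lam * np.sum(w ** 2)

    w = np.array([1.0, -2.0, 0.5])
    squared_error = lambda w: float(np.mean((w - 1.0) ** 2))   # stand-in data loss
    print(objective(w, squared_error))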

Neural networks are used because they are powerful enough to represent many data sets of interest and because there exist good algorithms for finding LOCAL optima.

Unfortunately, they are complex, non-linear, non-convex functions, and hence finding a global optimum is most likely NP-hard.

We are left then with the heuristic of hoping that the local optima we are able to find are "close enough" to the global optimum to meet our needs.

This seems to work reasonably well for certain types of problems, like image classification, where the underlying data possess a certain simplicity or "smoothness", but less well for other problems, like logic problems, where good and bad solutions may not be "nearby" in any sense that the local optimization algorithms are able to discover.
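
A toy picture of that last point (my own sketch, not tied to any particular network): plain gradient descent on a one-dimensional non-convex function lands in different minima depending on where it starts, and nothing in the local updates can tell a good basin from a bad one.

    def f(x):
        return x**4 - 3 * x**2 + x       # non-convex: two minima of different depth

    def grad_f(x):
        return 4 * x**3 - 6 * x + 1

    def descend(x, lr=0.01, steps=2000):
        for _ in range(steps):
            x -= lr * grad_f(x)
        return x

    for x0 in (-2.0, 2.0):
        x = descend(x0)
        print(f"start {x0:+.1f} -> x = {x:+.3f}, f(x) = {f(x):+.3f}")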



