> My naïve understanding of deep learning is that it works by finding patterns in the answers, instead of actually solving problems.
I think this is quite profound and inspiring.
Although perhaps it is only one half of it. Concretely, deep learning finds patterns, and the best patterns are derived from the highest-bandwidth signal — which is often the input itself, not the answer.
Geoff Hinton has argued that the task, solving the problem, is a low bandwidth signal.
Hinton aphorises [approximately]: 'If you want to learn computer vision, first learn to do computer graphics, i.e. build a generative model.' This is a point about the bandwidth of the data signal.
Hinton: [1] "Each image has much more information in it than a typical label... Each image puts a lot of constraint on the identity function. Whereas if I give you an image and a label and I try and get the right answer I don't get much constraint on the mapping from image to label. The bits of constraint on that mapping imposed by training example are just the number of bits to say what the answer is which is not very many."
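Hinton's "bits of constraint" point can be made concrete with back-of-the-envelope arithmetic: a raw image carries orders of magnitude more bits than a class label. The sketch below assumes a small 32x32 RGB image and a 1000-way label set purely for illustration; the exact numbers are not from the talk.

```python
import math

# Rough comparison of the supervision "bandwidth" Hinton describes:
# how many bits one training example supplies via the input image
# versus via its class label. Image size and class count are
# illustrative assumptions, not figures from the lecture.

def image_bits(height, width, channels=3, bit_depth=8):
    """Upper bound on the raw information content of one image."""
    return height * width * channels * bit_depth

def label_bits(num_classes):
    """Bits needed to specify one label out of num_classes."""
    return math.log2(num_classes)

img = image_bits(32, 32)   # 24,576 bits for a small RGB image
lab = label_bits(1000)     # ~9.97 bits for a 1000-class label

print(f"image: {img} bits, label: {lab:.2f} bits, "
      f"ratio ~ {img / lab:.0f}x")
```

Even for a tiny image, the input carries thousands of times more bits per example than the label, which is why learning a generative model of the input imposes far more constraint on the weights than label prediction alone.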
[1] @5:28 http://videolectures.net/mlss09uk_hinton_dbn/