“We realized that the neural nets did not encode knowledge necessary to produce an image of a fire truck, only the knowledge necessary to tell fire trucks apart from other classes,” [Yosinski] explained.
This seems markedly different from biological neural networks. Is the difference one of network structure/algorithm or rather the fact that biological neural networks (in human image processing) actually have time and space to learn a lot about each individual image class?
Deep nets are only loosely inspired by neurobiology. That's why LeCun calls them "convolutional nets" and not "convolutional neural nets" and prefers "nodes" over "neurons".
It is, however, possible to have a deep net produce 3D models/images: https://www.youtube.com/watch?v=QCSW4isBDL0 "Learning to Generate Chairs with Convolutional Neural Networks".
I also suspect a different part of cognition is involved when humans are asked to recreate a "fire truck" than when they are asked to distinguish a "fire truck" from a "car". The former seems closer to using memory ("what did the last five fire trucks I saw look like?"). A fairly recent addition to deep nets is the use of external memory: http://arxiv.org/pdf/1410.5401.pdf "Neural Turing Machines". So the difference may quickly become less significant.
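The quote is really about the discriminative/generative distinction: a classifier only needs a decision boundary, while producing an image requires a model of what the class looks like. A minimal sketch of that gap, using toy 2-D Gaussian features as stand-ins for image features (all numbers and names here are illustrative, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D features for two classes ("fire truck" vs "car" stand-ins).
truck = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
car = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))
X = np.vstack([truck, car])
y = np.concatenate([np.ones(100), np.zeros(100)])

# Discriminative model: logistic regression learns only a boundary
# between the classes, trained by plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# It can tell the classes apart...
pred = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
acc = np.mean(pred == y)

# ...but holds no recipe for producing a fire truck. A generative model
# (here a per-class Gaussian) captures what the class looks like, so it
# can synthesize a new example by sampling.
mu, sigma = truck.mean(axis=0), truck.std(axis=0)
new_truck = rng.normal(mu, sigma)  # a freshly sampled "fire truck"
```

The logistic weights `w` encode only the direction that separates the two clusters; the Gaussian parameters `mu`/`sigma` encode the cluster itself, which is why only the second model can generate. The chair-generation and Neural Turing Machine papers linked above are, in effect, deep-net versions of that second kind of model.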