I think it's a useful visualization, but I prefer matrix plots for observing the weights. You can see the weights start to differentiate as training proceeds, and you'll notice that some layers learn much faster than others. The unit activations (neuron outputs) are similarly useful to visualize.

Example of weights on a matrix plot: http://imgur.com/T48Wal1
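
If you want to try this yourself, here is a rough sketch of how such a plot could be made with NumPy and matplotlib. It is not how the linked image was produced. The function name `plot_weight_snapshots`, the output file `weights.png`, and the random stand-in matrices are all made up for illustration; in practice you would pass in weight matrices copied out of your own network at different training steps.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt


def plot_weight_snapshots(snapshots, path="weights.png"):
    """Draw one matrix plot per (label, 2-D weight array) pair.

    All panels share one symmetric color scale, so a layer whose weights
    have moved further from zero looks visibly more saturated.
    """
    vmax = max(np.abs(w).max() for _, w in snapshots)
    fig, axes = plt.subplots(1, len(snapshots),
                             figsize=(4 * len(snapshots), 4), squeeze=False)
    for ax, (label, w) in zip(axes[0], snapshots):
        im = ax.matshow(w, cmap="coolwarm", vmin=-vmax, vmax=vmax)
        ax.set_title(label)
    fig.colorbar(im, ax=list(axes[0]))
    fig.savefig(path)
    plt.close(fig)
    return path


# Stand-in data (an assumption, not real training output): small random
# initial weights, then the same matrix after a simulated large update.
rng = np.random.default_rng(0)
w_init = rng.normal(scale=0.01, size=(16, 32))
w_later = w_init + rng.normal(scale=0.5, size=(16, 32))

saved = plot_weight_snapshots([("step 0", w_init),
                               ("later (simulated)", w_later)])
```

The same function works for activations: pass a (batch, units) array of neuron outputs instead of a weight matrix.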