Gram matrix, some non-linear optimization. Deep Learning itself is simple; the complexity comes from the loss function (for example, adjusting the weights of under-represented categories), from combining the different LEGO blocks when building your network, and from checking whether that particular non-linear optimization actually works in your case. If you want to go really deep, there is state-of-the-art math research on "why do we think Deep Learning works, when it shouldn't", which Jeremy mentions.
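As a minimal sketch of what "adjusting weights of under-represented categories" can mean in the loss function, here is a hypothetical NumPy example (not from the lecture) of a class-weighted cross-entropy, where mistakes on a rarer class are penalized more heavily:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean cross-entropy where each example is scaled by its class weight.

    probs: (N, K) predicted class probabilities
    labels: (N,) integer class labels
    class_weights: (K,) per-class weights
    """
    # Probability the model assigned to the correct class of each example.
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(class_weights[labels] * -np.log(picked)))

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
labels = np.array([0, 1, 1])

uniform = np.ones(2)                   # every class counts equally
upweighted = np.array([1.0, 3.0])      # class 1 is rare, so triple its weight

print(weighted_cross_entropy(probs, labels, uniform))
print(weighted_cross_entropy(probs, labels, upweighted))
```

With the up-weighted classes, the same predictions produce a larger loss, so gradient descent is pushed harder to fix errors on the rare class. In practice, frameworks expose this directly (e.g. a per-class `weight` argument on the loss).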