To reiterate what others said, but what was not emphasized enough from my point of view:
AI is academic (as a synonym for 'theoretical' and 'math-intensive'). Once you look beyond purely symbolic AI, which proved to be infeasible as @curuinor pointed out somewhere here, you will need to build up at least basic knowledge in probability theory and linear algebra.
If you've never had any exposure to probability theory or statistics, I recommend having a look at the course "MIT 6.041 Probabilistic Systems Analysis and Applied Probability" taught by John Tsitsiklis at MIT (video lectures are available through YouTube and MIT OpenCourseWare for free). Both the course and Tsitsiklis' book are superb learning materials to get into probabilistic thinking.
Strang's class is very pretty and excellent and just a little bit off from the center of the sorts of linear algebra used in machine learning. Not a lot off, but a little off.
A field that does inspire a lot of deep learning folks and never gets mentioned in this sort of thing is the theory of physical dynamical systems. Attractor is a term that came from here, for example, and much of the mathematics behind the numerical fuckery behind deep nets is dynamical in nature. RNNs are entirely dynamical systems. The classic there is Strogatz's book (https://www.amazon.com/Nonlinear-Dynamics-Chaos-Applications...).
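To make the "RNNs are dynamical systems" point concrete, here is a minimal sketch, assuming a single-unit recurrent network with no input, h_next = tanh(w * h + b). The function name and the values w = 2.0, b = 0.0 are made up for illustration and don't come from any particular paper or library. With those values the zero state is unstable and the state settles onto one of two attractors, depending on which side of zero it starts.

```python
import math

def iterate_rnn(h0, w=2.0, b=0.0, steps=200):
    """Iterate the one-unit recurrence h <- tanh(w * h + b) and return the final state.

    w and b are illustrative values, not taken from any trained network.
    """
    h = h0
    for _ in range(steps):
        h = math.tanh(w * h + b)
    return h

# Two nearby starting states end up at different attractors.
h_pos = iterate_rnn(0.1)
h_neg = iterate_rnn(-0.1)
print(h_pos, h_neg)
```

Each final state is a fixed point of the map, i.e. it satisfies h = tanh(2h), which is the sort of object Strogatz spends a lot of time on.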
There is also information theory, of course, which is part of the MacKay source.
Many of the earlier papers in deep learning-land are really nontrivial to read, because the terminology and worldview of everybody has changed so much. So reading original Werbos or Rumelhart is really difficult. This is really not the case for Sutton and Barto, "RL: An Introduction" (http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html). Two editions, apparently the second edition is basically getting with the program on shoving DL into everything.
Schmidhuber often mentions that Gauss was the original shallow learner. This is a technically correct statement (best kind of statement), but you should probably know linear and logistic regression like the back of your hand before starting on DL too much.
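For anyone who wants to check how well they know it, here is a rough sketch of logistic regression fit by plain batch gradient descent on the average log loss, in pure Python. The toy data, the learning rate, the step count, and the names fit_logistic and predict are all made up for illustration; a real project would use a library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the mean log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return sigmoid(w * x + b)

# Toy, deliberately non-separable 1-D data (hypothetical).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
w, b = fit_logistic(xs, ys)
print(w, b, predict(w, b, 2.0), predict(w, b, -2.0))
```

Replace the sigmoid with the identity and the log loss with squared error and the same loop does linear regression, which is roughly why these two are worth knowing cold before stacking layers.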
To preface, I'm currently learning several disciplines in tandem along a route suggested by the link, so kudos to them for putting together a solid list of resources.
Now, from the link: "Few universities offer an education that is on par with what you can find online these days. The people pioneering the field from industry and academia so openly and competently share their knowledge that the best curriculum is an open source one."
On the one hand, it is true there are a ton of resources where the largest cost is the time it takes to go through the learning process. And I'm awestruck that research papers are so openly available and practitioners are so willing to share their knowledge with others, both by posting their books as PDFs/HTML files and by creating online courses.
On the other hand, how feasible is it for an individual to work on notable AI companies/projects without a Masters or PhD in a related field? Can that gap be crossed merely by becoming fluent in the various disciplines involved in AI, and then contributing research or experiments you've conducted on your own, outside formal academia?
The Google Brain Residency is a cool program for non-academics to get into deep learning research, and you can always get into AI on the applications side, but in both cases you're going to have to really try.
What is too much though? Backpropagation uses derivatives, some filters in Computer Vision use multivariate calculus. If you want to have a thorough understanding then calculus is necessary. That said, Andrew Ng was quite good at avoiding calculus in his Machine Learning MOOC, and for applied machine learning I guess calculus is not that important.
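As a small illustration of where the derivatives show up, here is a sketch of the chain rule for a single sigmoid neuron with squared-error loss, checked against a finite-difference estimate. The setup (one weight, one bias, one training example, the specific numbers) is an assumption chosen to keep it short; it is not how any framework implements backpropagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Squared error of one sigmoid neuron on a single example."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2

def grad_w(w, b, x, y):
    """dL/dw via the chain rule: dL/da * da/dz * dz/dw."""
    a = sigmoid(w * x + b)
    return (a - y) * a * (1.0 - a) * x

def numeric_grad_w(w, b, x, y, eps=1e-6):
    """Central finite-difference estimate of dL/dw, used only as a sanity check."""
    return (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2.0 * eps)

# Arbitrary illustrative values.
analytic = grad_w(0.5, -0.2, 1.5, 1.0)
numeric = numeric_grad_w(0.5, -0.2, 1.5, 1.0)
print(analytic, numeric)
```

Backpropagation in a deep net is this same product of local derivatives, applied layer by layer, so being comfortable with partial derivatives and the chain rule covers most of what you need from calculus here.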
A great place to study math is www.khanacademy.org. They have courses on calculus, probability/statistics and linear algebra.
Strang's complaint is that there's too little linear algebra. This is true. This doesn't overshadow the fact that you're not going to get out of using some partial derivatives in neural net land (and many other AI subfields).
The path I'm following at the moment is a quite rigorous one and is outlined here (http://www.deeplearningweekly.com/pages/open_source_deep_lea...).
Edit: Link was broken. Thanks to @blauditore.