I am under the impression that to learn statistics one must first have a working knowledge of probability theory, which in turn rests on grad-level mathematical analysis. Can machine learning be studied without any of that?
A lot of this is done in discrete math. You know, the actual probability is defined by this integral, but there is no closed-form solution to the integral, so we do sums to find an approximate answer. Anyone can understand sums. And, since they're probabilities, the sums must equal one. Not that hard, right ;)
It sure helps to understand the integral equations, especially if you want to read the original literature. But realistically you are going to need to understand summing, normalizing, algorithms for clustering, and so on. You probably don't want to write your own numerical code anyway; someone else did it, and they handled all the edge cases that a naive implementation misses.
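To make the "sums instead of integrals" idea concrete, here's a minimal sketch: approximate a continuous density on a discrete grid, then normalize so the probabilities sum to one. (The Gaussian and the grid bounds are just illustrative choices on my part.)

    # Discretize a density, normalize, and compute expectations as plain sums.
    import numpy as np

    xs = np.linspace(-5, 5, 1001)              # discrete grid over the support
    unnormalized = np.exp(-xs**2 / 2)          # Gaussian density, up to a constant
    probs = unnormalized / unnormalized.sum()  # normalize: probs.sum() == 1

    # Expectations become plain sums over the grid.
    mean = (xs * probs).sum()                  # ~0 for a standard Gaussian
    second_moment = (xs**2 * probs).sum()      # ~1
    print(mean, second_moment)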
You can find PDFs of the James, Witten, Hastie, Tibshirani book "An Introduction to Statistical Learning" [1]. Scroll on through - there is nothing intimidating math-wise. All the heavy lifting is left to R.
I don't really know what you're looking for. If it's a replacement for that Coursera ML class in Python, then I don't think there really is one. The basic tenets of ML aren't going to change depending on your language, though.
Thanks a lot for this! I didn't realize probabilities would be so important. I've been working with conditional expectations (not sure if they're relevant in machine learning), so this was an eye-opener.
Another great introduction is the pair of descriptive and inferential statistics courses on Udacity!
Conditional expectations are an important part of regression, and they come up in other scenarios where you might want to adjust a parameter estimate ("for every unit increase in x we get this much of a difference in y") for confounders. Generally, in machine learning, parameter estimates are not the (exclusive) basis for prediction; instead, you put data in and a prediction comes out, and what's in between is somewhat of a black box.
Well, technically we do know what's in the black box of course, it's just that for many methods it's not easy to summarize because there's so much happening under the hood. Leo Breiman (who invented random forests) gives some examples of how to do it, though: https://projecteuclid.org/euclid.ss/1009213726
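If it helps, here's a minimal sketch of the "unit increase in x" reading: a linear regression coefficient is exactly the estimated slope of the conditional expectation E[y | x]. (The synthetic data and scikit-learn are my choices, purely for illustration.)

    # The fitted coefficient estimates the change in y per unit increase in x.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=(500, 1))
    y = 3.0 * x[:, 0] + 2.0 + rng.normal(0, 1, size=500)  # true slope: 3

    model = LinearRegression().fit(x, y)
    print(model.coef_[0])  # ~3.0: estimated change in y per unit increase in x

Adding more columns to x is the "adjusting for confounders" version: each coefficient is then the change in y per unit of that feature, holding the others fixed.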
Like most fields, it depends on your definition of "studied." If you want to push the envelope in theoretical non-applied research, you're going to want to learn analysis & measure theoretic probability theory. If you want to apply existing techniques, read (well-written) papers and code up the algorithms you find there, you can get away with undergraduate-level linear algebra & probability knowledge - Bayes' rule, expectations, independence, the general ability to think about random variables (and matrices thereof) as values that can be transformed and combined. And of course, you can fire up a classifier in SciPy without knowing any of this at all. But that's stretching the definition of "studied" quite a bit!
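For instance, "firing up a classifier" can be this short. (The parent says SciPy; in practice this usually means scikit-learn, which is what I'm sketching here - the dataset and model are arbitrary choices.)

    # Train and score a stock classifier in a few lines.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print(clf.score(X_test, y_test))  # accuracy on held-out data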
I personally went into a graduate-level probabilistic machine learning course with probability knowledge consisting of an undergraduate course that followed Ross http://www.amazon.com/Introduction-Probability-Models-Tenth-... - so there's certainly no need to have been a math major. But if you've never dealt with random variables whatsoever, you'll hit a wall following research from the last 20 years.
There is applied machine learning (using machine learning to solve business problems) and theoretical machine learning (optimization bounds, proofs, algorithm design).
With applied machine learning it is certainly possible to quickly get a working knowledge without too much reliance on statistics or difficult theory. You can compare it to using a sorting function without knowing exactly how it works (but you know how fast it is and when to use it).
If you have an engineering background, take a look at the wide array of high-quality ML code and tools. Study trendy and powerful tools like XGBoost.
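As a taste of how little theory you need to get started, here's a minimal XGBoost sketch via its scikit-learn-style wrapper. (The synthetic data and hyperparameters are illustrative choices, not recommendations.)

    # Fit a gradient-boosted tree classifier and score it on held-out data.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))  # held-out accuracy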
What do you mean by grad-level math analysis? Much of probability theory can be learned with basic multivariate calculus. (Perhaps there's a terminology misunderstanding here - when I see "grad level" I think "grad school," i.e., masters/PhD.) Certainly basic probability theory is a plus.
I agree with many of the responses here that mathematical analysis (epsilon-delta proofs, continuity, etc.) is not strictly necessary for statistics. But...it certainly will help.
The problem with dumping the measure-theoretic probability is that you won't really know what a random variable is. It has a definition (a measurable function into the reals), and without that, you will have a tendency to think of it as "a box that produces something random when you look into it". This will limit your ability to understand papers, and will make you insecure in talking to people.
Besides "random variable", other common notions will also be hard to understand without measure-theoretic probability, like "almost surely", convergence concepts, the difference between the SLLN and WLLN, etc.
The problem with dumping analysis is that you will not know some basic things, like what a continuous function is. What does "everywhere continuous" mean? What is a C1 function? And again, you will have a hard time reading and speaking.
For what it's worth, I found analysis to be not that fun, but measure-theoretic probability to be really a fun, tight, theory. It was enjoyable to learn.
Measure theory being necessary to statistics is rather contentious; a better discussion is on Andrew Gelman's blog [1].
My school's PhD stats program does require real analysis before the prelims, but for most intents and purposes, 'multi' and 'linal' (as the cool kids say) should be sufficient for machine learning from a comp sci perspective.
I haven't fully worked through ESLR (Hastie and Tibshirani's advanced version of ISLR, posted above), but the majority of the math there is linear algebra with some differential equations and calculus thrown in. I've heard Harvard Stat 210 and Berkeley Stat 205A/B cited as good examples of mathematical stat classes - if you're seriously interested, maybe take a look at those syllabi.
Elementary (i.e., math-speak for "undergrad-level") probability theory is quite accessible to someone with only a computer scientist's math classes. You really don't need real analysis until you start reading research papers on probability and they drop down into measure theory for this-and-that.
I would say you can certainly get away with applying machine learning techniques without knowledge of probability theory, but if you want to do stuff like compare models, compare results, determine the accuracy of your model, etc., you are going to quickly have to dive into basic statistics (Bayesian + frequentist).
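The "compare models" step in its simplest form might look like this - cross-validated scores give you a distribution to reason about, not just one number. (Models and synthetic data here are arbitrary choices of mine.)

    # Compare two classifiers via 5-fold cross-validation.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, random_state=0)

    for model in (LogisticRegression(max_iter=1000),
                  RandomForestClassifier(random_state=0)):
        scores = cross_val_score(model, X, y, cv=5)
        print(type(model).__name__, scores.mean(), scores.std())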
Some nice courses there; also check out Daniel Cremers' lectures on variational methods for computer vision if you're interested in that sort of thing. There's also a nice series on computer vision for special effects.
For linear algebra, I like the "No BS guide to linear algebra" (https://gumroad.com/l/noBSLA) which also includes a high school math refresher for people who need it (I did).
For probability, "Probability Demystified" is a good basic intro.
For statistics, I would really recommend Allen Downey's Think Stats (http://greenteapress.com/thinkstats2/index.html), especially if you're coming from a programming background. Most introductions to statistics focus heavily on the mathematics needed to enable certain analytical approximations to difficult probabilistic calculations (e.g. the t-test), whereas Think Stats just bites that bullet and focuses on simulation / brute force so you can spend more time on the actual fundamental theory behind statistics.
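To illustrate that simulation-first style: a permutation test can stand in for a t-test with no analytical machinery at all. (This is my own sketch of the approach, not code from the book; the data is synthetic.)

    # Shuffle the group labels many times and ask how often the difference in
    # means is at least as large as the one observed.
    import numpy as np

    rng = np.random.default_rng(0)
    group_a = rng.normal(0.0, 1.0, size=100)
    group_b = rng.normal(0.3, 1.0, size=100)

    observed = abs(group_a.mean() - group_b.mean())
    pooled = np.concatenate([group_a, group_b])

    count = 0
    n_trials = 10_000
    for _ in range(n_trials):
        rng.shuffle(pooled)
        diff = abs(pooled[:100].mean() - pooled[100:].mean())
        if diff >= observed:
            count += 1

    print(count / n_trials)  # simulated p-value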
Depends on how basic you're imagining. Khan Academy [0] is a fairly well-regarded free resource for high-school and undergraduate level mathematics video lectures. They have probability and statistics as well as linear algebra courses.
If you prefer textbooks, I have heard good things about "Linear Algebra Done Right," [1] but I would not recommend it unless you are "math literate" at an undergraduate level already.
For linear algebra, check out Prof. Gilbert Strang's course on MIT OCW. He's great at explaining the material and the course resources are comprehensive.
The videos on computervisiontalks.com are exactly the same as the videos on YouTube, because the site pulls them from YouTube. The post points to the spring 2015 lectures; you are pointing to earlier lectures from 2013 and 2014.
I like his teaching style, but it seems some of the lecture videos (1.3, for example) are cut off - very frustrating! For anyone watching nonetheless, I recommend going into YouTube and changing the speed to 1.5x.
The lectures on computervisiontalks are taken directly from YouTube (with tags, navigation, bookmarking, and in-video search added). Lecture 1.3 (for the spring 2015 class) is exactly the same length. However, the lectures for the 2013 machine learning class (also by Alex Smola) are available on YouTube, and those are a different length.
I like Smola's ML book, and it's great to see a full-depth ML course online; I'll certainly watch some videos. My only complaint is that the audio quality could be better.
I took the course (Alex Smola's 10-701, Spring '15) in the classroom. Personally I don't like his lecture style -- too vague, too many assumptions, too much reliance on jargon he hasn't already explained. YMMV.