The one thing I see over and over and over again, in these articles and the associated discussion, is a failure to recognize the distinction between "ML research" and "applied ML". You always get people in the threads yelling "you can't do ML without graduate level knowledge of measure theory, real analysis, information theory, ..." and so on, ad infinitum. And that is very likely true for most ML research. But OTOH, you can absolutely take off-the-shelf libraries, plus the level of knowledge you would get from, say, taking Andrew Ng's Coursera class, and create value for your company using ML. No measure theory or linear algebra needed.
And of course all this happens along a spectrum from "requires the least theory/math" to "requires the most theory/math". It's not a strictly binary dichotomy.
So is the person completing the fast.ai course going to be the inventor of the next great new NN training algorithm? Maybe not. Probably not, even. But are they going to be able to apply ML to solve real problems? Yeah, most likely.
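To make the "off-the-shelf libraries" point concrete, here's a minimal sketch of what that kind of applied ML can look like. It assumes scikit-learn and pandas, plus a hypothetical churn.csv with numeric feature columns and a "churned" label; it's not from any particular course, just an illustration of the level of code involved:

    # Minimal "off-the-shelf" applied ML sketch. The dataset and column names are
    # hypothetical; the point is that no measure theory is involved, just library calls.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("churn.csv")                      # hypothetical tabular dataset
    X, y = df.drop(columns=["churned"]), df["churned"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))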
Fast.AI is the best course out there if you would like to get into the applied side of deep learning. However, I wish people would be more careful about their claims. It's one thing to mechanically apply the same techniques to the same problems. It's quite another thing to compose known techniques in a new way to solve new problems. And it's different still to invent brand new techniques to solve previously unsolvable problems. The first category is "practitioner", the second is "engineer", and the third is "scientist".
To transition from practitioner to engineer, one needs to know the low-level details, including the directly relevant math and the popular prior art. One of the distinguishing differences between a practitioner and an engineer is that the latter needs to be able to do systematic debugging to identify and fix problems. I often ask folks during interviews how they would go about it when their ML model for X doesn't work as expected. This usually reveals the difference between practitioner and engineer with high accuracy.
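To give one hedged example of the kind of systematic check I'd hope to hear about (a generic sketch, not tied to any particular problem X; it assumes PyTorch and an existing model, loss function, and DataLoader): verify that the training loop can overfit a single small batch before blaming data volume or hyperparameters.

    import torch

    def can_overfit_one_batch(model, loss_fn, loader, steps=200, lr=1e-3):
        # Sanity check: if the loss can't be driven near zero on one fixed batch,
        # the bug is in the model/loss/optimizer wiring, not in data volume or tuning.
        xb, yb = next(iter(loader))                    # one fixed batch, reused every step
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()
        return loss.item()                             # should approach ~0 if the pipeline is sound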
To transition from engineer to scientist, one needs to know a much broader body of prior art and math beyond what's directly relevant. A typical interview that would differentiate between engineer and researcher would be to discuss some known unsolved problem and see if the person can elaborate on the details of prior art, the opinions of top researchers, the wins achieved so far, and directions for future efforts.
This is again not to say that practitioners can't do debugging or that engineers can't make groundbreaking innovations. Faraday, Edison, and the Wright Brothers all started out as practitioners or engineers and successfully attacked some of the hardest unsolved problems of their time, never obtaining formal academic credentials. On the other hand, far more academics have broken new ground than non-academics. At the end of the day it's all probabilities. Claiming that a PhD is no longer required to do "AI" doesn't reveal any of these subtleties, and marginalizes the tools and foundation that a PhD equips one with for this journey.
Andrew Ng's course definitely requires some linear algebra knowledge, and explains how things work under the hood.
Same goes for fast.ai
I think even that level of knowledge is not always necessary. Often, just having an intuitive understanding can be enough to get great results, e.g., understanding that this is a black box that takes features in and predicts things, and that you have to help it by giving it good features; or understanding that word2vec builds vector representations based on words that co-occur in the same context.
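For the word2vec case, that intuitive level is roughly this much code. A small sketch assuming gensim 4.x; the corpus here is just a placeholder standing in for your own tokenized data:

    from gensim.models import Word2Vec

    # Placeholder corpus: in practice, a list of tokenized sentences from your own data.
    corpus = [["the", "cat", "sat", "on", "the", "mat"],
              ["the", "dog", "sat", "on", "the", "rug"]]

    # Words that appear in similar contexts end up with similar vectors.
    model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=50)
    print(model.wv.most_similar("cat", topn=3))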
Speaking of learning/reviewing linear algebra, I wrote the NO BULLSHIT guide to LINEAR ALGEBRA, which covers all the first-year material in a very concise manner.
Nice! Did you work through some of the exercises and problems? Don't be a tourist in the land of math!
If you don't want to bother with pen-and-paper (which is the best, but takes a very long time), you should at least try some problems using the computer-assisted approach. See sample solutions to the first few chapters: https://github.com/minireference/noBSLAnotebooks
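For anyone curious what the computer-assisted approach looks like before opening the notebooks, it's along these lines (a small sketch using SymPy, not taken from the notebooks themselves):

    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

    print(A.rref())        # reduced row echelon form (and pivot columns)
    print(A.rank())        # rank of A
    print(A.nullspace())   # basis for the null space
    print(A.eigenvals())   # eigenvalues with their multiplicities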
>Andrew Ng's course definitely requires some linear algebra knowledge,
Extremely basic. Anyone with an undergrad engineering degree will have covered what's needed for the course. His course was almost patronizing, calling people who knew the basics of calculus "advanced students".
I got halfway through his course and it seemed to do a decent enough job of teaching it as you go. The main thing I struggled with was the mathematical notation.
Throwing ML at something and seeing what sticks is how one ends up with systems that no-one understands and which have horrible impacts in the real world.
Is such a course enough to be able to discern what data to best collect for this and that ML model?
For any arbitrary range of values for $DATA_TO_COLLECT and $ML_MODEL? Maybe not. For many real world scenarios? I'd say yes. Heck, sometimes the data you have available simply "is what it is" and there is no question of "what data to best collect".
Let me also reiterate that learning to use ML is a continuum... and I would not posit that one could take Andrew's class, or the fast.ai class, and claim to be "done" with their ML education. To me either (or both) of those is just step one on a journey that could continue more or less forever (or until ASI comes along and enslaves the human race, a la The Matrix).
Personally I took Andrew's (original) Coursera course, but since then I've continued to study Linear Algebra and Calculus through other forums, have been working through several other ML books, and have a whole litany of stuff queued up to study, up to and including the aforementioned Measure Theory, Complex Analysis, etc. I definitely value that stuff, I just think you can accomplish some useful tasks before you get to that level.
I often make this point ("if you don't have a grad-level understanding..."), and people get pissy. Same with statistics ("if you don't have a PhD in stats, you don't understand stats").
--
I had someone contact me for a "consulting" position, to replace their rule-based insurance claim denial system with "ML". Contract work, nobody in-house to review its performance, no insight into the problems of the domain, and they wanted to throw ML at it to "solve their problem".
When I refused, and pointed out that it would be highly irresponsible to contract out this type of work, particularly given its life-or-death implications, the CEO got angry. Told me it's "post-claims processing" and it isn't life or death. To which I told him bullshit: your denial of claims is going to directly influence how doctors practice their treatment. There is a direct feedback cycle. The fact that you don't see it makes it even more dangerous.
They simply didn't understand how HIGHLY inappropriate it was to just throw random ML at a problem, particularly as a one-off consulting project (with no in-house expertise).
^ that's the real world.
Personally I don't think you should be allowed anywhere near ML UNLESS you have that PhD in computer science. I don't even think you should be allowed to HIRE people for ML until you fully understand the hazards of letting a computer control critical decisions.
So yeah, with respect to your distinction of "ML research" and "applied ML": NO.
> Personally I don't think you should be allowed anywhere near ML UNLESS you have that PhD in computer science
A lot of ML research is done by PhDs in other fields. I did research that focused on developing compact group invariant features (for neural networks) for predicting local atomic energies in materials science, and a few mathematicians I follow did work on developing convolutional neural networks that utilize Clebsch–Gordan coefficients to generalize the translational invariance to other symmetries.
On the contrary, a lot of CS machine learning research is heavily application focused (generate such-and-such new thing using a GAN). If anything, mathematicians are the ones who understand machine learning at the very deepest level of theory. This isn't to say there aren't many theory-focused CS presentations/publications; I'm just refuting your point that highly theoretical machine learning research is purely the domain of CS PhDs.
This seems like it misses the point. Obviously there are professional numerical analysts, statisticians, mathematical physicists, etc. who have sufficient background and interest to keep up with cutting edge research and do solid work in machine learning. The argument is not that everyone needs a CS degree per se, but rather that you shouldn’t have your excel guy who just went through a machine learning MOOC but has no further training or deeper understanding try to apply machine learning to life-or-death problems.
> shouldn’t have your excel guy who just went through a machine learning MOOC but has no further training or deeper understanding try to apply machine learning to life-or-death problems.
Sure, but most problems aren't life-or-death. They're mundane problems related to improving a business process for a widget manufacturer, etc.
> Personally I don't think you should be allowed anywhere near ML UNLESS you have that PhD in computer science.
I'd rather have people have a PhD in logic or ethics, so they hopefully won't make, for example, racist programs without even thinking about it. Unfortunately, so far I don't have a lot of faith in computer science as a field when it comes to ethics, as it's all about the technical challenge. That something can be built does not mean it should be built.
> So yeah, with respect to your distinction of "ML research" and "applied ML": NO.
Interesting. I don't see anything in the story above that supports that position. There's nothing about the scenario you described that would be affected by having, or not having, a PhD in CS. If anything, as somebody else pointed out, this is more of a question of philosophy or ethics.
It's also a pretty niche example, which is not representative of the kinds of things that ML can be used for. If you want somebody to work on pricing optimization or a recommender system for your e-commerce site, you really don't need somebody who's doing cutting-edge ML research.