A high bias low-variance introduction to Machine Learning for physicists (bu.edu)
200 points by yoquan on Aug 16, 2018 | 21 comments



This is very, very, very good.

Book is here: https://arxiv.org/abs/1803.08823

____

My review: The authors provide a condensed summary of all the central topics in machine learning. Topics include ML basics, ML theory, and optimization algorithms, but also a detailed introduction to modern deep learning methods. Code examples and tutorials are provided as Jupyter notebooks for each chapter [2]. The book uses three datasets (MNIST digit recognition, SUSY physics data, and simulated nearest-neighbor Ising model configurations) as running examples throughout, to help learners understand what different ML techniques bring when analyzing the same problems from different perspectives.
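
To give a flavor of the notebooks, here's a minimal sketch of my own (not taken from [2]; scikit-learn's small load_digits set stands in for the full MNIST data the book actually uses):

    # Sketch of a notebook-style exercise: softmax regression on digits.
    # load_digits is an 8x8 stand-in for MNIST, so it runs offline in seconds.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X / 16.0, y, test_size=0.25, random_state=0)  # scale pixels to [0, 1]

    clf = LogisticRegression(max_iter=1000)  # multinomial logistic baseline
    clf.fit(X_train, y_train)
    print("test accuracy:", clf.score(X_test, y_test))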

The book has high bias (it is written from a physics perspective) but low variance: assuming a physics background allows the authors to write a very focused narrative that gets to the point and communicates three books' worth of information in 100 pages. This is somewhat of a repeat of the general physics-explanations-of-ML-for-the-win pattern established in Bishop's `Pattern Recognition and Machine Learning`.

The authors are wrong to label this book as useful only to people with a physics background; in fact it will be useful to everyone who wants to learn modern ML. An estimator with high bias but high efficiency is always useful.

____

For all my Hacker News peeps who want to learn ML and/or DL: you need to drop everything right now, go print this on the office printer, and sit outside with coffee for the next two weeks reading through this entire thing. Turn off the computer and phone. Stop checking HN for two weeks. Trust me, nothing better than this will come around on HN anytime soon.

[1] book pdf: https://arxiv.org/pdf/1803.08823

[2] Jupyter notebooks (zip): http://physics.bu.edu/~pankajm/ML-Notebooks/NotebooksforMLRe...


Wow, that must be one of the most enticing endorsements I have ever read. I am definitely checking this out soon. (Physicist by training, now working with deep learning)


There are literally dozens of us!


It fits quite well. In particular, many of our formalisms are used (if altered) all over machine learning. Training as a physicist gives you a wide range of analytical skills. I've seen former physicists doing everything from clinical pediatric cancer research and bioinformatics to chip design, financial work, and ML.


I cringed a little at "many of our formalisms". The maths of thermodynamics is so useful, yet it took (and takes) substantial effort to pry the useful maths away from the thermodynamics gobbledygook that surrounds it.

But it's certainly true that Deep Learning, with its combination of mathy underpinnings and poorly understood behaviour, is something that fits very well with a physicist's skillset.


it's 100 pages because it's double column and like 10 point font lol. but still a good concise survey. will indeed peruse on coffee breaks. also kudos on the efficiency pun :)


I get the idea that the title "High Bias, Low Variance" is some sort of reference or pun, but I'm not quite getting it...

I mean, I know that there's a bias-variance tradeoff in stats and ML, but what does it mean in the context of introduction to ML for physicists?

My guess is they mean they aren't going into as heavy detail in ML, which means the reader may lack some knowledge (high bias) but won't miss the forest for the trees (low variance).

Anyone else care to speculate?


I guess the author is not claiming a well-rounded (low bias) introduction to ML and statistics, but a highly biased and specialized (low variance) course that is tailored to the author's own interests and tastes.


Quite possibly.


High bias, meaning it is opinionated, written from the author's own perspective.

Low variance, I think, meaning the author keeps a sharp, high-quality focus on the subjects he/she is going to talk about.


Almost agree with what you said. One more piece of context: the first author wrote about the connection between physics's renormalization group and deep learning, which was featured in [1] four years ago.

[1]: https://www.quantamagazine.org/deep-learning-relies-on-renor...


Seems the most plausible explanation to me, so far.


Yeah after looking through the PDF, I'm still wondering what they meant. Here are some conjectures:

- The book has high bias (since it is written from a physics perspective), but low variance, since assuming a physics background allows the authors to write a very focused narrative that works for physicists (low variance over the subset of physicist readers).

- If we define the problem as choosing which topics t1, t2, ..., tn to cover in an ML book, the authors are saying that using the physicist's approach will lead to consistently good results (i.e. not all over the place). It's something like an argument that physicists are sensible people, unlike CS theoreticians, industry practitioners, and mathematicians.


The reference is to the bias-variance tradeoff: https://en.wikipedia.org/wiki/Bias–variance_tradeoff
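
For the record, the standard decomposition under squared-error loss looks like this (my own recap in LaTeX, with expectations taken over training sets D and label noise):

    \mathbb{E}\big[(y - \hat{f}_D(x))^2\big]
      = \underbrace{\big(f(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2}_{\text{bias}^2}
      + \underbrace{\mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]}_{\text{variance}}
      + \underbrace{\sigma^2}_{\text{noise}}

A simpler model shrinks the variance term at the cost of a larger bias term, and vice versa.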


Yes, I know. I mention that in my original comment.


I interpreted “high-bias” as meaning it has an idiosyncratic and specific audience, but “low variance” as meaning with high probability it will be an effective teaching tool for that audience.


Yeah, seems like a sane interpretation.


all these different interpretations suggest that lots of people, myself included, have a fuzzier understanding of the bias-variance tradeoff than we'd care to admit.


My take on this: the high bias means that the models used in the book are the standard ones used in physics. The low variance means that the author doesn't use any artificial model or parameter except those that arise from physical considerations. Care is taken to motivate ML models physically, and models far from physics aren't used with many parameters (hyperparameters).


The "model" presented for ML is low complexity, hence lives on the high bias, low variance side of the space of possible models.


I love it when physicists teach. Concise and pedagogical. Gives us physicists a good name :)

Michael Hartl (Caltech physicist) also does a killer job with the Rails Tutorial. There's definitely a trend.



