We're using this book for a "book club" at work, doing 1 chapter every 2 weeks. Chapter 1 covers Jupyter, 2 covers numpy, 3 pandas, 4 matplotlib, and 5 machine learning. We just made it through the first 4 chapters and they lay a good foundation for those libraries. I suspect chapter 5, which covers scikit-learn and machine learning techniques, is the meatiest and most interesting chapter. It is a long chapter so we will spend a month on it.
I recommend combining this book with McKinney's Pandas book[1] and the author's excellent YouTube presentations at PyCon and PyData. Start with "Statistics for Hackers"[2] by Jake VanderPlas and then look for his others.
The notebooks say they are an excerpt from the book, but somewhere else it's mentioned that you can read the book in its entirety at the posted link. So do the notebooks have all of the content, or only part of the book?
To my knowledge, the notebooks include all, or almost all, of the content in the print book. Jake mentions a few times in talks that the notebooks are "compiled" into the O'Reilly book format. The nice thing about having the book as notebooks is you can literally "run the book as code" just by pointing Jupyter at the cloned repo.
I work with Jake (the author) at the eScience Institute at the University of Washington (though I'm merely a grad student) and can say that he is not only an extraordinary data scientist and educator but is a great guy as well. He worked extraordinarily hard on this, so I'm very glad to see it on the front page of HN. I'll be sure to show him the screenshot tomorrow!
I'll add that there's an emacs Jupyter mode if that's of any interest.
I agree in principle that a better plaintext format could be interesting, but I don't see how embedding graphs etc. could be easily done without a special interface of some kind.
That said, I can imagine a mode where you simply have any editor open on one half of the screen and a browser that autorefreshes on the other. This is more or less how I work with Emacs and Evince when working on LaTeX and it's great. Synchronising vertical position with the cursor position might be a challenge.
You don't need a strong math background. I got through a few books (PDSH included) and a couple of MOOCs on data sci/ML over the last couple of years with high school level math skills + some extra reading. Not everything is explained in minute detail but there are plenty of other sources to supplement it if you want to go deeper.
You can buy it. It's available for Kindle on Amazon. Also available DRM-free on ebooks.com.
Still a bummer that O'Reilly stopped selling books directly. There have been so many recently published books that I'm interested in that I can no longer purchase.
One of the challenges with Wes's book is that it is quite old (2014). A lot of the commands and functions mentioned in the book are obsolete or have been removed, so the code fails.
The OP book is relatively recent (2016). The majority of the code still runs as written. Only a few of the commands and functions mentioned generate deprecation warnings. This book also covers packages and ML exhaustively. I have gone through this book cover to cover and enjoyed it. This is the first and only book I have found that covers data analysis with Python comprehensively. I wish the author had covered data cleaning a little bit more.
Good to know. I've been recommending Wes's book for some time now to people new to data work in Python, but between the discussion here and the consistently high quality of Jake's blog posts and demos, I'll have to keep this one in mind.
Out of curiosity, what sorts of material had you hoped he'd cover on data cleaning?
It is a great complement to Wes McKinney's "Python for Data Analysis": it is more like a recipe book, whereas Wes's book goes into the internals. Also, JVP includes more than just Pandas and NumPy goodies.
Highly recommended. Fork it to create your own curated handbook.
[1]: http://shop.oreilly.com/product/0636920050896.do
[2]: https://www.youtube.com/watch?v=Iq9DzN6mvYA