So this is a book written by applied mathematicians for applied mathematicians (they state in the preface it’s for scientists, but some theoretical scientists and engineers are essentially applied mathematicians). As a result, both the topics and the presentation are biased towards those types of people. For example, I’ve never seen anyone in practice worry about the existence and uniqueness conditions for their gradient-based optimization algorithm in deep learning. However, that’s the kind of result those people do care about, and academic papers are written on the topic. The title does say that this is a book on the theoretical underpinnings of the subject, so I am not surprised that it is written this way. People also don’t necessarily read these books cover-to-cover, but drill into the few chapters that use techniques relevant to what they themselves are researching. There was a similarly verbose monograph I used to use in my research, but only about 20-30 pages had the meat I was interested in.
This kind of book is more verbose than I’d like, both in terms of rigor and content. For example, they include Gronwall’s inequality as a lemma and prove it. The version they use is a bit more general than the one I normally see, but Gronwall’s inequality is a very standard tool for analyzing ODEs, and I have rigorous control theory books that state it without proof to avoid clutter (they do provide a reference to a proof). A lot of this verbosity comes about when your standard of proof is high and the assumptions you make are small.
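For reference, the version I normally see is the standard integral form, stated here from memory as a sketch rather than a quote from the book (their version is more general):

```latex
% Gronwall's inequality, standard integral form:
% u continuous on [a, b], beta >= 0, alpha a constant.
\[
  u(t) \le \alpha + \int_a^t \beta(s)\, u(s)\, ds \quad \text{for all } t \in [a, b]
  \;\Longrightarrow\;
  u(t) \le \alpha \exp\!\left( \int_a^t \beta(s)\, ds \right).
\]
```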
Are there any books you recommend for deep learning that are written for developers who don't use math every day?
I suppose the goal would be to understand deep learning well enough to know what's going on, without getting stuck in math concepts that we probably don't know and won't use.
I am/was in this scenario. I'm sure there are other resources out there specifically aimed at developers, but a book I'm reading now is "Deep Learning From Scratch" by Seth Weidman. He takes a different approach, explaining each concept in three distinct ways: mathematically, with diagrams, and with code.
I like this approach because it allows me to connect the math to the problem, which I otherwise wouldn't have been able to do.
I think if you are truly trying to understand deep learning, you will never be able to avoid the math, because that's really what it is at its core: a bunch of (non-linear) functions chained together (an obvious gross oversimplification).
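To make that oversimplification concrete, here's a toy sketch (my own throwaway NumPy example, not from any of the books mentioned) of a forward pass as affine maps composed with nonlinearities:

```python
import numpy as np

def relu(x):
    # Elementwise nonlinearity
    return np.maximum(0.0, x)

def forward(x, params):
    # params is a list of (weight, bias) pairs; each hidden layer is an
    # affine map followed by a nonlinearity, and the layers are simply composed.
    for W, b in params[:-1]:
        x = relu(x @ W + b)
    W, b = params[-1]
    return x @ W + b  # final layer left linear

rng = np.random.default_rng(0)
params = [(rng.normal(size=(4, 8)), np.zeros(8)),
          (rng.normal(size=(8, 8)), np.zeros(8)),
          (rng.normal(size=(8, 1)), np.zeros(1))]
print(forward(rng.normal(size=(2, 4)), params))  # two toy inputs -> two outputs
```

Everything else (training, regularization, architectures) is layered on top of that basic composition.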