If you mainly care about prediction, rather than inspecting the fitted parameters, a lot of this detail is usually overkill.
To generalize well, it's almost always a good idea to use some form of regularization, such as penalizing the sum of squares of the parameters (an L2/ridge penalty). The extra term in the cost function usually makes the naive "normal equations" approach work fine, and gives much the same predictions as fancier pivoted-QR approaches. On my machine it's also a lot faster (ballpark ~10x for large systems).
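A minimal NumPy sketch of the comparison above (the data, penalty value `lam`, and problem sizes are made up for illustration): the ridge term turns the normal equations into a well-conditioned solve, and the result matches what a stable least-squares solver (SVD-based in NumPy's `lstsq`) gives on the equivalent augmented system.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 50
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam = 1e-3  # ridge penalty strength (assumed value, purely for illustration)

# Naive normal equations with a ridge term: solve (X'X + lam*I) beta = X'y.
# The lam*I shift keeps the matrix positive definite and well-conditioned.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Stable reference: the same ridge problem written as ordinary least squares
# on an augmented system, solved by lstsq (SVD under the hood in NumPy).
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_ls, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

# The two solutions agree closely on a well-behaved problem like this one.
print(np.allclose(beta_ridge, beta_ls, atol=1e-6))
```

The gap between the two approaches only really opens up when `X` is badly conditioned and `lam` is tiny or zero, which is exactly the regime the "just regularize" advice steers you away from.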
I'm glad R has super-solid robust GLM implementations. And unless you're fitting many models, you should probably just use such a library routine. However, I wish more tutorials and textbooks would spend more time on the reasons for numerical stability, and when one should care, rather than pushing that detail off into a trail of citations.
Nice overview. I thought it was a bit odd that they pulled out a Stieltjes integral so early on to define the mean and variance (i.e. mean = ∫ x dF(x), where F is the CDF). I wouldn't expect most people reading this sort of introductory material to be familiar with it.
I suppose it's the one definition that works for both continuous and discrete variables. Although I actually read it as the more common Riemann definition (i.e. mean = ∫ x·f(x) dx) until you pointed it out.
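To spell out why the single Stieltjes definition covers both cases: when F is differentiable you have dF(x) = f(x) dx and recover the Riemann form, and when F is a step function the integral collapses to a sum over the jump points. A sketch:

```latex
\mu = \int x \, dF(x) =
\begin{cases}
  \displaystyle\int x \, f(x) \, dx & \text{continuous case, } F' = f \\[1ex]
  \displaystyle\sum_i x_i \, p(x_i) & \text{discrete case, } F \text{ jumps by } p(x_i) \text{ at } x_i
\end{cases}
```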
This is a great post! I tried implementing my own GLMs [1] a short while ago. But I ran into a lot of trouble with numerical instability and had a hard time tracking down ways to solve these edge cases.
Hopefully with this as a resource I'll be able to make some more progress on it!