As soon as we recognize plain old regression as machine learning, then we start ...

fractionalhare · on Nov 24, 2020

I think you're being facetious, but on the off-chance you're not, and for the benefit of others: averages are incredibly useful in practice for modeling systems. Parameter estimation (which generalizes averages and applies to other distribution features like variance) is a foundational modeling methodology. It's useful for both understanding and forecasting data. Measures of central tendency are nearly always good (if obviously imperfect) models of systems.

Here is a trivial example: one of the best ways of modeling timeseries data, both in and out of sample, is to naively take the moving average. This is a rolling mean parameter estimate on n lagged values from the current timestep. Not only is this an excellent way of understanding the data (by decomposing it into seasonality, trend and residuals), it's a competitive benchmark for future values. The first step in timeseries analysis shouldn't be to reach for a neural network or even ARIMA. It should be to naively forecast forward using the mean.
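To make the "forecast forward using the mean" idea concrete, here is a minimal sketch in plain Python. The function name, the window size and the data are made up for illustration; the only thing it assumes is a list of past observations ordered in time.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(series) < window:
        raise ValueError("need a positive window and at least `window` observations")
    recent = series[-window:]  # the n lagged values before the step being forecast
    return sum(recent) / window


# Hypothetical observations, oldest first.
history = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]

# One-step-ahead forecast from the last three values.
forecast = moving_average_forecast(history, window=3)
print(forecast)
```

The window is the only tuning knob here, which is part of why this works well as a first benchmark: there is very little to overfit.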

You might be surprised at how difficult it is to beat that benchmark with cross-validation and no overfitting or look-ahead bias.
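One way to run that comparison without look-ahead bias is a walk-forward evaluation, where each forecast only sees data from before the step being predicted. The sketch below is an illustration under assumptions, not a full cross-validation setup: the helper names, the made-up series and the `last_value` competitor (standing in for whatever model you want to test) are all hypothetical.

```python
def walk_forward_mae(series, forecaster, start):
    """Mean absolute error of one-step-ahead forecasts, walking forward in time."""
    errors = []
    for t in range(start, len(series)):
        # The forecaster only sees observations strictly before step t.
        prediction = forecaster(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)


def rolling_mean(history, window=3):
    """Benchmark: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def last_value(history):
    """Stand-in competitor: repeat the most recent observation."""
    return history[-1]


# Hypothetical observations, oldest first.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]

mae_mean = walk_forward_mae(series, rolling_mean, start=3)
mae_last = walk_forward_mae(series, last_value, start=3)
print(f"rolling mean MAE: {mae_mean:.2f}, last value MAE: {mae_last:.2f}")
```

Any candidate model can be dropped in as the `forecaster` argument, as long as it is fit or updated only on the history it is handed.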

dr_dshiv · on Nov 24, 2020

Thank you for your fabulous response. I hope my provocative comment wasn't in bad humor, disrespectful or trolling. I too love averages and regressions. Thank you for proudly defending these marvelously simple and powerful tools.

thegginthesky · on Nov 24, 2020

Well, actually working with "averages" as baselines before you start experimenting with more complex ML models is a good habit.

Sure, they are dummy regressors [1], but they can be so useful for proving that whatever ML model you choose is at least better than a dummy baseline. If your model can't beat it, then you need to develop a better one.
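A rough sketch of that check with scikit-learn's `DummyRegressor`, assuming synthetic data and a plain `LinearRegression` as the "real" model. The data, the split and the choice of mean absolute error are illustrative assumptions, not a recommendation.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic data (made up): y depends linearly on one feature, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1.0, size=200)

# Simple hold-out split.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# The dummy baseline always predicts the mean of the training targets.
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
model = LinearRegression().fit(X_train, y_train)

baseline_mae = mean_absolute_error(y_test, baseline.predict(X_test))
model_mae = mean_absolute_error(y_test, model.predict(X_test))

print(f"baseline MAE: {baseline_mae:.2f}, model MAE: {model_mae:.2f}")
```

If the model's error is not clearly below the baseline's, that is the signal to go back and rethink the model or the features.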

They can even be used as a place-holder model so you can develop your whole architecture surrounding it, while another teammate is iterating over more complex experiments.

You could also settle on a moving-average process as a first model for a time series [2], because such models are easy to implement and simple to reason about.

Never under-estimate the power of an "average".

[1] https://scikit-learn.org/stable/modules/generated/sklearn.du...

[2] https://en.wikipedia.org/wiki/Moving-average_model