As soon as we recognize plain old regression as machine learning, we start to see "averages" as models of systems, and how practically useful could that be?
I think you're being facetious, but on the off-chance you're not, and for the benefit of others: averages are incredibly practically useful for modeling systems. Parameter estimation (which generalizes averages and applies to other distribution features like variance) is a foundational modeling methodology. It's useful for both understanding and forecasting data. Measures of central tendency are nearly always good (if obviously imperfect) models of systems.
Here is a trivial example: one of the best ways of modeling timeseries data, both in and out of sample, is to naively take the moving average. This is a rolling mean parameter estimate on n lagged values from the current timestep. Not only is this an excellent way of understanding the data (it's the usual first step in decomposing a series into trend, seasonality and residuals), it's also a competitive benchmark for forecasting future values. The first step in timeseries analysis shouldn't be to reach for a neural network or even ARIMA. It should be to naively forecast forward using the mean.
You might be surprised at how difficult it is to beat that benchmark once you evaluate honestly, with cross-validation and without overfitting or look-ahead bias.
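To make the benchmark concrete, here is a minimal sketch in plain Python, assuming a one-step-ahead forecast and a walk-forward evaluation. The function names, the window of 3 and the toy series are all made up for illustration; they don't come from any particular library.

```python
from statistics import fmean


def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` past observations")
    return fmean(history[-window:])


def walk_forward_mae(series, window):
    """One-step-ahead mean absolute error.

    Each forecast only sees values strictly before the point being
    predicted, so there is no look-ahead bias.
    """
    errors = []
    for t in range(window, len(series)):
        forecast = moving_average_forecast(series[:t], window)
        errors.append(abs(series[t] - forecast))
    return fmean(errors)


# Toy series, invented for this example.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
baseline_mae = walk_forward_mae(series, window=3)
print(f"moving-average baseline MAE: {baseline_mae:.2f}")  # prints 1.20
```

Whatever fancier model you try next should be scored with the same walk-forward loop, so the two error numbers are directly comparable.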
Thank you for your fabulous response. I hope my provocative comment wasn't in bad humor, disrespectful or trolling. I too love averages and regressions. Thank you for proudly defending these marvelously simple and powerful tools.
Well, actually, working with "averages" as baselines before you start experimenting with more complex ML models is a good habit.
Sure, they are dummy regressors [1], but they can be so useful for showing that whatever ML model you choose is at least better than a dummy baseline. If your model can't beat it, then you need to develop a better one.
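A rough sketch of that check, in plain Python so it runs without any dependencies. `MeanBaseline` is a hypothetical stand-in written for this comment (scikit-learn ships a `DummyRegressor` that plays the same role), and the numbers and "candidate" predictions are invented.

```python
from statistics import fmean


class MeanBaseline:
    """Hypothetical dummy regressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = fmean(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def mean_squared_error(y_true, y_pred):
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Invented data: features are ignored by the baseline, so placeholders suffice.
X_train, y_train = [[0], [1], [2], [3]], [2.0, 4.0, 6.0, 8.0]
X_test, y_test = [[4], [5]], [10.0, 12.0]

baseline = MeanBaseline().fit(X_train, y_train)
baseline_mse = mean_squared_error(y_test, baseline.predict(X_test))

# Pretend these came from whatever "real" model you are evaluating.
candidate_preds = [9.5, 12.5]
candidate_mse = mean_squared_error(y_test, candidate_preds)

model_is_worth_keeping = candidate_mse < baseline_mse
print(baseline_mse, candidate_mse, model_is_worth_keeping)
```

Because the baseline exposes the same `fit`/`predict` shape as a real model would, it can be swapped out later without touching the code around it.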
They can even be used as a placeholder model, so you can build your whole architecture around it while another teammate iterates on more complex experiments.
You could also settle on a moving average process as a first model for a time series [2], because it is easy to implement and simple to reason about.