The author makes it sound like statistics is this grand beautiful mathematical edifice and ML is just a bunch of number crunching with computers. That contrast is unfair: a huge portion of applied stats is itself made up of hacks and cookbook recipes. Statistics has probably done more damage to the world than any other discipline, by giving a sheen of respectability to fake science in fields like nutrition, psychology, economics, and medicine.
I'm particularly annoyed by the implication that statisticians have a better understanding of the issue of overfitting ("why having p >> n means you can't do linear regression"). Vast segments of the scientific literature fall victim to a mistake that's fundamentally equivalent to overfitting, and the statisticians either didn't understand the mistake, or liked their cushy jobs too much to yell loudly about the problem. This is why we have fields where half of the published research findings are wrong.
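For anyone who hasn't seen the p >> n failure concretely, here's a minimal numpy sketch (the setup is my own, purely illustrative): once predictors outnumber observations, least squares will "explain" pure noise perfectly, and the fitted coefficients are worthless on fresh data.

    # With p >> n the linear system is underdetermined: OLS can hit every
    # training point exactly even though y is pure noise.
    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 20, 200                        # far more predictors than observations
    X = rng.normal(size=(n, p))
    y = rng.normal(size=n)                # response is noise, no real signal

    # lstsq returns the minimum-norm solution in the rank-deficient case
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("in-sample residual norm:", np.linalg.norm(y - X @ beta))  # ~0, a perfect "fit"

    # The same coefficients on fresh noise predict nothing
    X_new, y_new = rng.normal(size=(n, p)), rng.normal(size=n)
    print("out-of-sample residual norm:", np.linalg.norm(y_new - X_new @ beta))

The in-sample fit is exact by construction; much of the literature I'm complaining about commits the same mistake in more elaborate clothing.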
> Statistics has probably done more damage to the world than any other discipline, by giving a sheen of respectability to fake science in fields like nutrition, psychology, economics, and medicine.
This seems really unfair. You can misuse statistics, but it's an extremely powerful tool when properly used and understood. Most powerful tools can be misused - you can write terrible code and (try to) publish bad mathematics too. But much of modern science would be intractable without statistics, including physics, chemistry, biology and applied math, because we'd otherwise be unable to draw reasonable conclusions from anything less than a totality of data.
As someone with a graduate education in probability and statistics, I think it's fair to lay some of the blame for the reproducibility crisis at the feet of statisticians because of poor education. Statisticians should accept at least some responsibility if students from non-math majors graduate without understanding the material.
That being said, it should definitely be noted that actual statisticians have been talking about this crisis for decades. Statisticians have basically always known that there's nothing magical about the conventional p <= 0.05 significance threshold, for example. And for the most part, it's not statisticians who are causing the bad science. Rather, it's non-statisticians applying statistics without qualified statistical review, in ways they can't be expected to get right when it isn't their core competency.
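To make the 0.05 point concrete, here's a quick simulation (my own construction, assuming independent tests): under a true null, every individual test "discovers" something 5% of the time, so a study that quietly runs 20 comparisons stumbles onto at least one "significant" result about 64% of the time (1 - 0.95^20).

    # Each study runs 20 two-sample t-tests on pure-noise groups; count how
    # often at least one comes back "significant" at the 0.05 cutoff.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)
    trials, tests_per_study, n = 2000, 20, 30
    hits = 0
    for _ in range(trials):
        pvals = [ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
                 for _ in range(tests_per_study)]
        hits += min(pvals) < 0.05     # any nominally significant result counts
    print("studies with at least one false discovery:", hits / trials)  # ~0.64

Nothing in the arithmetic cares about 0.05 specifically; pick any cutoff and enough comparisons will eventually clear it.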
In my opinion it's something of a philosophical problem - many fields and journals are only now realizing that it's unreasonable to expect, e.g., a professional psychologist to also be an expert statistician. Having a dedicated statistician review the stats - instead of another psychologist who hasn't touched the material since an upper-year undergrad course - is a giant leap forward in catching bad statistics in new research.
> Statistics has probably done more damage to the world than any other discipline, by giving a sheen of respectability to fake science in fields like nutrition, psychology, economics, and medicine.
I think it's unfair to represent the class of people who misrepresent their findings (charlatans and liars) as a problem with statistics. I'd blame that on poor understanding of statistics and the publish-or-perish mindset of academia.
> the statisticians either didn't understand the mistake, or liked their cushy jobs too much to yell loudly about the problem
You're obviously not someone who considers themselves a statistician. I do, and we have basically been telling everyone who would listen that there are huge fundamental issues with the way many scientists hinge their whole careers on p-values and similar things. Whether that message has been properly received is another story. The American Statistical Association has even published multiple official statements cautioning against the misuse of p-values, the 0.05 cutoff, and the practice of using a single quantity to assess the impact and validity of anything.
See [1] Wasserstein & Lazar (2016), "The ASA Statement on p-Values: Context, Process, and Purpose", The American Statistician, and [2] Wasserstein, Schirm & Lazar (2019), "Moving to a World Beyond 'p < 0.05'", The American Statistician.
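To illustrate the "single quantity" point, here's a small sketch (summary statistics invented for illustration): two studies with near-identical p-values but a 30x difference in effect size, so the p-value alone tells you almost nothing about practical impact.

    # Two studies, both "significant" at p <= 0.05, with wildly different effects.
    from scipy.stats import ttest_ind_from_stats

    # (label, group-A mean, group-B mean, per-group n); unit variance assumed
    studies = [("tiny effect, huge n", 0.0, 0.02, 20000),
               ("large effect, small n", 0.0, 0.60, 25)]

    for label, m_a, m_b, n in studies:
        res = ttest_ind_from_stats(m_a, 1.0, n, m_b, 1.0, n)
        print(f"{label}: p = {res.pvalue:.3f}, effect = {m_b - m_a:.2f}")
    # Both p-values land just under 0.05, yet one effect is 30x the other.

A reviewer who only reads the p column would treat those two results as interchangeable.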