> I wouldn't consider estimating, say, the mean length of a population of fish contrived (unbiased estimate: x-bar). Nor would I consider estimating the probability of an event based on observations of the event happening or not happening contrived (unbiased estimate: p-hat = #successes/#trials).
Sure, maybe not contrived; my point is that flat priors may work in many "typical" textbook stats problems, but they are one of many choices, and that choice is important to be explicit about and not sweep under the rug. Because if your entire life is measuring sample means, fine, you're never going to need to think about this very much and life will be nice. But when one fine day you decide to do something more complex, these are the land mines that you shouldn't really ignore.
> These kinds of simple estimation problems and the associated statistical tests account for probably 90% of statistical practice. Dismissing them as contrived is silly.
Whether it's 90% is totally dependent on the types of problems you do. I don't mean to dismiss them; you're right that for many problems MLE is just fine. I meant to illustrate that "unbiased" comes with many caveats, and that in many real scenarios flat priors are not ok.
> More generally, MLE estimates are always (under regularity conditions) asymptotically unbiased even if not unbiased for a finite sample. This means that the amount of bias decreases to zero as the sample size increases, no matter what the parameterization is.
Is this not true of the MAP for most priors? Gaussian/Laplace priors will have this property too, since priors become asymptotically less important the more data you have. If your prior is zero over some of the support, you're out of luck but this doesn't strike me as a good argument for MLE > MAP or for using flat priors everywhere. When we have infinite data, sure, priors are irrelevant, but we live in the real world where data is not infinite.
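For what it's worth, here's a minimal sketch of that asymptotic point, under toy assumptions I'm making up (Normal data with known variance, a conjugate Gaussian prior deliberately centered in the wrong place): the MAP's pull toward the prior mean shrinks like 1/n, just as you'd expect.

```python
import numpy as np

# Toy sketch (made-up numbers): x_i ~ N(theta, sigma^2) with sigma known and a
# Gaussian prior theta ~ N(mu0, tau^2). The MAP estimate is a precision-weighted
# average of the sample mean and the prior mean, so its pull toward mu0 decays
# like O(1/n) -- the same asymptotics described above for the MLE.
rng = np.random.default_rng(0)
theta_true, sigma = 2.0, 1.0
mu0, tau = 0.0, 0.5   # a deliberately "wrong" prior

for n in (10, 100, 10_000):
    x = rng.normal(theta_true, sigma, size=n)
    mle = x.mean()
    map_est = (n * mle / sigma**2 + mu0 / tau**2) / (n / sigma**2 + 1 / tau**2)
    print(f"n={n:6d}  MLE={mle:.4f}  MAP={map_est:.4f}  pull={mle - map_est:+.4f}")
```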
> Finally, there is very often a natural parameterization for any given problem. If you're interested in the arithmetic mean of a population, there's no reason to use a log-scale parameterization.
Sure, agree that parametrization isn't a problem a lot of the time, but it is something important to be mindful of, and this points towards, again, not forgetting that you are always using a prior and that you should think about whether or not that prior makes sense.
> Why worry about bias in other parameterizations when you can just use the natural parameterization, where the estimator is unbiased? Again, I don't think such scenarios are contrived: a very large proportion of statistical analyses deal with simple measurements in Euclidian (or very nearly Euclidian--we can typically ignore, for example, relativistic effects) spaces: real world dimensions, time, etc.
Yea, I mean sure: for easy problems, parametrization is obvious. That's kind of tautological. But sometimes it's not obvious, or sometimes for computational reasons you need to work with a log(theta) instead of theta, etc. If you're a frequentist and you're thinking life is great because you don't need to worry about priors, you're wrong and sooner or later you will get into trouble; be it a parametrization issue or something else, priors are not just something you can completely ignore. It's like saying "I always drive without looking in my rearview mirror" -- ok, great, you will be fine a lot of the time, but eventually one day you will change lanes on the highway at the exact wrong time, and you will really regret your habit of not looking in your mirror.
> If you're a Bayesian and very concerned about parameterization effects you can also use a Jeffreys prior, which is parameterization-invariant. Notably, for the mean of a Normal distribution, the Jeffreys prior is... the flat prior!
Yep, totally agree, I have no problem with Jeffreys priors (when they make sense), and that's all well and good. Just to clarify: I am not saying "don't use flat priors" -- flat priors are extremely reasonable and a good idea in many cases. My point is that flat priors are still priors, and you are still making a statement by using them: "let's assume all possible values of theta are equally likely a priori". Sometimes we don't really believe that, but it's useful to see the implications of making this assumption. And sometimes priors are extremely important (e.g. we want a time-dependent measurement of a Poisson rate, like conversions per dollar of ad spend, and conversions are relatively rare: priors are your friend here, e.g. a GP prior = Cox process or something else, even if this prior is an operational assumption).
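To make the rare-event point concrete, here's a stand-in example I'm making up (not the Cox-process setup above, just the simplest conjugate Gamma-Poisson case): with only a couple of observed events, the choice of prior visibly moves the posterior; with thousands of events it would barely matter.

```python
from scipy import stats

# Hypothetical numbers: 2 conversions observed over $1,000 of ad spend.
# Posterior for the Poisson rate under an (improper) flat prior on the rate
# vs. a weakly informative Gamma(shape=0.5, rate=500) prior.
events, exposure = 2, 1_000.0
flat_post = stats.gamma(a=events + 1, scale=1 / exposure)
info_post = stats.gamma(a=events + 0.5, scale=1 / (exposure + 500))
print("posterior mean (flat prior):        ", flat_post.mean())
print("posterior mean (informative prior): ", info_post.mean())
print("95% interval (flat prior):          ", flat_post.interval(0.95))
print("95% interval (informative prior):   ", info_post.interval(0.95))
```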
> Yes and no. The Bayesian and frequentist approaches answer different mathematical questions, but they are used by humans to answer the same human questions, such as "do these two populations have the same mean?"
Yes, agreed.
> Indeed this is one of the primary reasons Lindley's paradox arises: the Bayesian model comparison (using marginal likelihoods or Bayes factors) gets tricked by the diffuse prior, while the frequentist model comparison (using null hypothesis testing) does not.
Ah lord, but this is a terrible justification for using null hypothesis rejection: we're almost always choosing a very simplistic distribution (e.g. Gaussian) to do this, and reducing the question to "we reject H0 because it's very unlikely" is part of the reason why there's a replication crisis in, e.g., the social sciences: researchers are taught this simplistic picture without any of the necessary nuance ('here are the assumptions we make, and under these assumptions + H0, it is a little bit unlikely that we would have observed x'). That's a recipe for disaster. Is it not much better to discuss the full posterior, "degrees of belief" and to be explicit about all of our uncomfortable prior assumptions? I prefer Bayesian model selection over null hypothesis rejection 100% of the time, especially because "Bayesian model selection" is the only logical way to do model selection, the only caveat is that it depends on reasonable prior assumptions and these are the hard part (but again, at least it is explicit!).
Also, the Lindley's "paradox" example certainly seems contrived: we believe there's a 50% chance that p = 0.5 exactly?? I just don't understand that type of analysis. Come up with a prior, derive your posterior, and decide the answer to your question yourself (what is the chance that p=0.5 exactly? Well, it's exactly 0%. How much more likely is it that p=0.5036 vs p=0.5? That's a better question...). By contrived, I mean that it appears designed to exploit the fact that Bayesian stats will automatically prefer simpler models, especially one with 0 degrees of freedom that is relatively close to the right answer, but that's a Good Thing (TM).
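For anyone following along, here's roughly the setup being argued about, with counts I've made up to land in the interesting regime: the null-hypothesis test rejects p = 0.5, while the point-mass-plus-uniform prior produces a Bayes factor that favors p = 0.5.

```python
from scipy import stats

# Hypothetical data chosen to sit in the Lindley regime: large n, p-hat near 0.5.
# H0: p = 0.5 exactly (prior mass 0.5); H1: p ~ Uniform(0, 1) (prior mass 0.5).
n, k = 100_000, 50_400   # made-up counts, p-hat = 0.504

# Frequentist two-sided test of p = 0.5:
z = (k - n * 0.5) / (n * 0.25) ** 0.5
p_value = 2 * stats.norm.sf(abs(z))

# Bayes factor: under a Uniform(0, 1) prior the marginal likelihood is 1/(n+1).
bf_01 = stats.binom.pmf(k, n, 0.5) * (n + 1)

print(f"p-value = {p_value:.4f}   (rejects H0 at the 0.05 level)")
print(f"BF_01   = {bf_01:.1f}      (evidence in favor of H0)")
```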
Both frequentist stats and Bayesian stats are easy to abuse: Bayesian stats gives a false sense of comfort because people don't worry enough about their choice of prior, but at least Bayesian stats is explicit about the prior! I won't say that hypothesis testing is complete garbage, but it is quite dangerous, and frankly dishonest, to reduce things to a p-value and pretend that's the end of the discussion.
> But when one fine day you decide to do something more complex, these are the land mines that you shouldn't really ignore.
> in many real scenarios flat priors are not ok.
> eventually one day you will change lanes on the highway at the exact wrong time, and you will really regret your habit of not looking in your mirror.
Can you give some examples where frequentists hit these alleged flat-prior landmines? I am admittedly a Bayesian by training, not a frequentist, so perhaps it's just my ignorance showing, but I'm not aware of any such situations.
Frequentist statistics generally relies on performance guarantees (bounds on the false positive error rate for tests, in particular, and coverage for confidence intervals) which are derived under the lack-of-explicit-prior, so as far as I can tell they should be doing fine. I'd be interested in seeing examples where frequentist analyses fail because of the (implicit) flat prior.
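As a quick illustration of the kind of guarantee I mean (just a toy simulation with made-up parameters): the textbook 95% t-interval for a Normal mean hits its nominal coverage, and no prior appears anywhere in the construction.

```python
import numpy as np
from scipy import stats

# Toy coverage check: 95% t-intervals for the mean of N(mu, 4), n = 20 per sample.
rng = np.random.default_rng(1)
mu, n, trials = 3.0, 20, 20_000
covered = 0
for _ in range(trials):
    x = rng.normal(mu, 2.0, size=n)
    half = stats.t.ppf(0.975, df=n - 1) * x.std(ddof=1) / n**0.5
    covered += (x.mean() - half) <= mu <= (x.mean() + half)
print(f"empirical coverage: {covered / trials:.3f}")   # should be close to 0.95
```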
> we're almost always choosing a very simplistic distribution (e.g. Gaussian) to do this
The Gaussian distribution is a marvelous thing. The central limit theorem is, in my humble opinion, one of the most beautiful and surprising results in mathematics.
> Is it not much better to discuss the full posterior, "degrees of belief" and to be explicit about all of our uncomfortable prior assumptions?
Perhaps I'm just cynical, but I'd say probably not. A Bayesian decision process is still a decision process and still subject to all the problems that the frequentist decision process (null hypothesis significance testing) is subject to: inflated family-wise error rates, p-hacking (except with Bayes factors rather than p-values), publication bias, and so on. At best getting everyone to do Bayesian analyses might be roughly equivalent to getting everyone to use a lower default significance threshold, like 0.005 instead of 0.05 (which prominent statisticians have advocated for).
> I prefer Bayesian model selection over null hypothesis rejection 100% of the time, especially because "Bayesian model selection" is the only logical way to do model selection, the only caveat is that it depends on reasonable prior assumptions and these are the hard part (but again, at least it is explicit!).
Sadly there's a trap in Bayesian model selection (often called Bartlett's paradox, though it's essentially the same thing as Lindley's paradox) which can be difficult to spot. No names out of respect to the victim, but several years ago I saw a very experienced Bayesian statistician who has published papers about Lindley's paradox fall prey to this. Explicit priors didn't help him at all. He would not have fallen into it if he had used a frequentist model selection method, though there are other problems with that.
> Also, the Lindley's "paradox" example certainly seems contrived:
And here we are again calling a statistical test that thousands of people do every day "contrived." You already know how I feel about that.
Yes, it's a very simple example, because that helps illustrate what's happening. Lindley's paradox can happen in arbitrarily complex models, any time you're doing model selection.
> By contrived, I mean that it appears designed to exploit the fact that Bayesian stats will automatically prefer simpler models, especially one with 0 degrees of freedom that is relatively close to the right answer, but that's a Good Thing (TM).
Preferring simpler models is not exactly what's going on in Lindley's paradox, at least not the way that most people talk about Bayes factors preferring simpler models (e.g. by reference to the k*ln(n) term in the Bayesian Information Criterion). The BIC is based on an asymptotic equivalence and drops a constant term. That constant term is actually what is primarily responsible for Lindley's paradox, and has only an indirect relationship to the complexity of the model.
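For reference, here is a sketch of where that constant sits, using the standard Laplace expansion of the log marginal likelihood (k parameters, n i.i.d. samples, theta-hat the MLE, I_1 the per-observation Fisher information); this is just a restatement of a textbook approximation, not anything specific to the example above.

```latex
% Laplace expansion of the log marginal likelihood:
\[
\ln p(D \mid M) \;\approx\;
  \underbrace{\ln p(D \mid \hat\theta) - \tfrac{k}{2}\ln n}_{\text{what BIC keeps}}
  \;+\;
  \underbrace{\ln p(\hat\theta) + \tfrac{k}{2}\ln 2\pi
              - \tfrac{1}{2}\ln\bigl|I_1(\hat\theta)\bigr|}_{\text{the $O(1)$ term BIC drops}}
\]
% For a diffuse prior of width tau, \ln p(\hat\theta) behaves like -\ln\tau per
% parameter, so the dropped term can be made arbitrarily negative: the
% Lindley/Bartlett mechanism.
```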
> Can you give some examples where frequentists hit these alleged flat-prior landmines? I am admittedly a Bayesian by training, not a frequentist, so perhaps it's just my ignorance showing, but I'm not aware of any such situations.
You're probably right: I myself am also a Bayesian by training, as you can probably guess, but I went through the usual statistics education from a frequentist standpoint, and once I learned Bayesian statistics it was almost an epiphany: much more intuitive and understandable than the frequentist interpretation (but that's just me). In all honesty, I think good frequentist statisticians and good Bayesian statisticians have nothing to worry about, since both should know exactly what they are doing and saying, as well as the limitations of their analysis.
I wouldn't put myself in either the "good frequentist" or "good Bayesian" categories, by the way; I am just an imperfect practitioner, and I think that's the case for most people. My argument against frequentist statistics for the masses is a practical one: I found myself getting into much more trouble, and having much less insight into what I was doing, with a frequentist background than I do when working from a Bayesian standpoint, and I see many imperfect frequentist statisticians like myself running into the same problems I used to (mostly ignoring priors when they shouldn't, or thinking a flat prior is always uninformative, etc.), though I admit that is a wholly subjective experience. I never once thought about priors before learning Bayesian stats, and I find that many people I meet with a frequentist background also forget the significance of priors, because they too are imperfect practitioners.
> Frequentist statistics generally relies on performance guarantees (bounds on the false positive error rate for tests, in particular, and coverage for confidence intervals) which are derived under the lack-of-explicit-prior, so as far as I can tell they should be doing fine. I'd be interested in seeing examples where frequentist analyses fail because of the (implicit) flat prior.
Yea, I totally agree, I just find that statistics is important in many more contexts than just this. While you can do this sort of thing from a Bayesian perspective (using Jeffreys priors or whatever the situation calls for), in my experience frequentists have a tough time departing from this type of analysis once they start diving into areas where priors are important (unless they are also familiar with Bayesian stats!).
> The Gaussian distribution is a marvelous thing. The central limit theorem is, in my humble opinion, one of the most beautiful and surprising results in mathematics.
Agree with you, but the CLT doesn't always help you. You may not always be interested in the statistics of averages in the limit of many samples. When you are, I agree the CLT is a godsend.
> Perhaps I'm just cynical, but I'd say probably not. A Bayesian decision process is still a decision process and still subject to all the problems that the frequentist decision process (null hypothesis significance testing) is subject to: inflated family-wise error rates, p-hacking (except with Bayes factors rather than p-values), publication bias, and so on. At best getting everyone to do Bayesian analyses might be roughly equivalent to getting everyone to use a lower default significance threshold, like 0.005 instead of 0.05 (which prominent statisticians have advocated for).
I disagree here. Discussing the full posterior forces you not to reduce the analysis to a single number like a significance threshold, and to acknowledge that there is actually a wide range of possibilities for different parameter values; that matters when your posterior isn't nice and unimodal, etc. I don't disagree that sometimes (well, many times) the significance threshold is all you really care about (e.g. "is this treatment effective, yes or no"), but that's still a subset of where statistics is used in the wild. E.g. try doing cosmology with just frequentist statistics (actually, do not do that, you may be physically attacked at conferences).
But again, I want to emphasize that doing Bayesian stats can also give you a false sense of confidence in your results. I don't mean to say Bayesians are right and frequentists are wrong or anything; I just mean that sometimes priors are important and sometimes they aren't, and I personally find that I have an easier time understanding when to use different priors in a Bayesian framework than in a frequentist one.
> Sadly there's a trap in Bayesian model selection (often called Bartlett's paradox, though it's essentially the same thing as Lindley's paradox) which can be difficult to spot. No names out of respect to the victim, but several years ago I saw a very experienced Bayesian statistician who has published papers about Lindley's paradox fall prey to this. Explicit priors didn't help him at all. He would not have fallen into it if he had used a frequentist model selection method, though there are other problems with that.
Like you say, there are problems with both approaches, but my point is that when the prior is explicit, we can all argue about its effects on the result, or lack thereof. Explicit priors don't "help" you, but they force you to make your assumptions explicit and part of the discussion. If you're only ever using flat priors, it's easy to forget that they're there.
> And here we are again calling a statistical test that thousands of people do every day "contrived." You already know how I feel about that.
I don't mean to be flippant about it or dismissive, I mean exactly what I said:
contrived: "deliberately created rather than arising naturally or spontaneously."
Which test in Lindley's paradox are you referring to when you say thousands of people use it every day? Just the null rejection, or is there another part of it you're referring to?
> Yes, it's a very simple example, because that helps illustrate what's happening. Lindley's paradox can happen in arbitrarily complex models, any time you're doing model selection.
My point isn't that it's simple; my point is that it's incredibly awkward and unrealistic, and not representative of how a Bayesian statistician would answer the question "is p = 0.5?", which is a very strange question to begin with. The "prior" here treats it as equally likely that p = 0.5 exactly and that p != 0.5; if that's genuinely your assumption, fine, but my point is that it's a very bizarre and unrealistic one. Maybe it seems realistic to a frequentist, but not to me at all. If someone were doing this analysis, I would expect a weird answer to a weird question.
> Preferring simpler models is not exactly what's going on in Lindley's paradox,
Exactly! I'm not sure what is going on in Lindley's paradox to be honest; I don't understand the controversy here: the question poses a very strange prior that seems designed to look perfectly reasonable to a frequentist but not to a Bayesian. But I suppose this is an important point about the way priors can fool you!
> at least not the way that most people talk about Bayes factors preferring simpler models (e.g. by reference to the k*ln(n) term in the Bayesian Information Criterion). The BIC is based on an asymptotic equivalence and drops a constant term.
I'm with you so far, and the BIC is a good asymptotic result, but I'm talking about the full solution here (which is rarely practical), the one that doesn't drop the constant term.
> That constant term is actually what is primarily responsible for Lindley's paradox, and has only an indirect relationship to the complexity of the model.
I mean, I think we're splitting hairs here? Maybe? My point was that Bayesian model selection won't make up for a strange prior, but given the right priors, Bayesian model selection just makes sense to me. Again, though, this is the important limitation of most Bayesian analyses: the prior can do strange things, especially the one used in the Lindley's paradox example on the Wikipedia page.
But honestly, if you think I'm missing some important part of Lindley's paradox, please do elaborate. I had not heard of it before you mentioned it, and I'm still confused as to why it's considered something "deep", but I assume that just means I'm missing something important.