> To produce a good analysis using either Bayesian or frequentist methodology (or to criticise such an analysis), you have to have deep domain knowledge. There's no getting around that, and arguably the use of p-values often lets you get away with shoddy domain knowledge.

The whole problem we're facing is that it requires too much domain knowledge and detailed analysis to dismiss results that are actually just noise. The whole point of p-values is that they give you a way to do that without needing that complex analysis with deep domain knowledge - they're not a replacement for doing in-depth analysis, they're a way to cull the worst of the chaff before you do, the statistical-analysis equivalent of FizzBuzz. Bayesianism has no substitute for that (you can't say anything until you've defined your prior, which requires deep domain knowledge), and as such makes the problem much worse.
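To make the "FizzBuzz filter" idea concrete, here's a minimal sketch (stdlib Python, made-up data): a two-sided z-test needs no prior and no domain model, just a null hypothesis and a known noise scale.

```python
import math
import random

def z_test_p_value(sample, sigma=1.0):
    """Two-sided z-test of H0: mean = 0 with known sigma -- no prior,
    no domain model, just an error-rate-calibrated screen."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) under the standard normal null is erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)
p_noise = z_test_p_value([random.gauss(0.0, 1.0) for _ in range(50)])  # pure noise
p_signal = z_test_p_value([3.0] * 50)  # obvious effect -> tiny p-value
```

The point being: the screen runs identically on any dataset, before anyone with domain knowledge has looked at it.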



> (you can't say anything until you've defined your prior, which requires deep domain knowledge)

Well, you can use a non-informative prior. And that's the correct choice when you genuinely don't have a better option. But you should always be able to justify that, and that in turn requires deep domain knowledge....which leads me to....
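For what it's worth, the conjugate beta-binomial case shows how little machinery a non-informative prior needs (the counts are hypothetical):

```python
# Beta(1, 1), i.e. uniform, as a 'non-informative' default prior on a
# coin's head-probability; 7 heads / 3 tails are made-up data.
prior_a, prior_b = 1, 1
heads, tails = 7, 3

# Conjugate update: posterior is Beta(prior_a + heads, prior_b + tails)
post_a, post_b = prior_a + heads, prior_b + tails
posterior_mean = post_a / (post_a + post_b)  # 8/12, about 0.667
```

Whether Beta(1, 1) is really "non-informative" here is itself a judgment call you'd have to defend.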

> The whole problem we're facing is that it requires too much domain knowledge and detailed analysis to dismiss results that are actually just noise.

....this is in no way a "problem" that needs fixing by allowing shortcuts that can easily be hacked. Rather, it's a factual statement about the difficulty of drawing correct conclusions in low signal-to-noise-ratio domains. Whether you use p-values or not, and whether you use Bayesian methodology or not, you cannot get around the need to understand the data you're working with. Bad p-values are worse than none, since you have no knowledge of what error rate they actually achieve in the long run.

> Bayesianism has no substitute for that

Yes it does. It's called Bayes factors. But as I said above, I completely disagree with your view of what a p-value is for.
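For readers who haven't met them: a Bayes factor compares how well two hypotheses predicted the observed data. A toy sketch for a coin, H0: p = 0.5 vs H1: p uniform on [0, 1] (the closed forms follow from the beta-binomial marginal likelihood; the counts are hypothetical):

```python
import math

def bayes_factor_coin(heads, n):
    """BF10 for H1: p ~ Uniform(0, 1) against H0: p = 0.5, given
    `heads` successes in `n` flips. Note that H1 still needs a prior;
    uniform is just one choice."""
    m0 = math.comb(n, heads) * 0.5 ** n  # marginal likelihood under H0
    m1 = 1.0 / (n + 1)                   # integral of Binom(k|n,p) over a uniform prior
    return m1 / m0

bf = bayes_factor_coin(8, 10)  # > 1 means the data favour H1
```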


> Well, you can use a non-informative prior. And that's the correct choice when you genuinely don't have a better option.

At which point you've just found a more cumbersome way to do frequentist statistics. Frequentist tools aren't inconsistent with Bayes' theorem (they can't be, since both rest on the same axioms of probability) - indeed one could say that the whole project of frequentist statistics consists of building a well-understood suite of pre-baked priors and computations that are appropriate to commonly encountered situations.
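A small illustration of the "pre-baked prior" point: for a normal mean with known sigma and a flat prior, the 95% Bayesian credible interval coincides numerically with the 95% frequentist confidence interval (the summary numbers below are hypothetical):

```python
import math

# Flat prior on a normal mean with known sigma: the posterior is
# N(xbar, sigma^2 / n), so the credible interval and the classical
# confidence interval are the same numbers.
sigma, n, xbar = 1.0, 25, 0.3     # hypothetical data summary
se = sigma / math.sqrt(n)
z975 = 1.959963984540054          # standard normal 97.5% quantile

ci = (xbar - z975 * se, xbar + z975 * se)        # frequentist 95% CI
credible = (xbar - z975 * se, xbar + z975 * se)  # Bayesian, flat prior
```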

> ....this is in no way a "problem" that needs fixing by allowing shortcuts that can easily be hacked. Rather, it's a factual statement about the difficulty of drawing correct conclusions in low signal-to-noise-ratio domains. Whether you use p-values or not, and whether you use Bayesian methodology or not, you cannot get around the need to understand the data you're working with.

Well, the fact is there are too many small-sample studies being produced for all or even most of them to be critically analysed by people with deep understanding. And maybe the right fix for the problem is to give the right incentives for that kind of critical analysis (e.g. by allowing that kind of analysis to count as research for the purposes of journal publications and PhD theses just as much as "the original study" does, given that a study without that kind of critical analysis cannot truly be said to represent advancing human knowledge). But if you just tell people to do Bayesian analysis instead of frequentist analysis then that's not going to magically create deep understanding - rather people will try to replace shallow frequentist analysis with shallow Bayesian analysis, and shallow Bayesian analysis is a lot less effective and more hackable.

> Yes it does. It's called Bayes factors.

But you still need a prior to compute a Bayes factor.
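And that choice of prior is not innocuous. A sketch with a normal mean (hypothetical numbers): the same data can give a Bayes factor favouring H1 under a moderate prior but favouring H0 under a very diffuse one - the Jeffreys-Lindley effect.

```python
import math

def normal_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def bf10(xbar, se, tau):
    """H1: mu ~ N(0, tau^2) vs H0: mu = 0, for a normal mean estimate
    xbar with standard error se. Under H1 the marginal distribution of
    xbar is N(0, se^2 + tau^2) -- so the answer depends on tau."""
    return normal_pdf(xbar, se**2 + tau**2) / normal_pdf(xbar, se**2)

# Same data (xbar = 0.4, se = 0.2, i.e. z = 2), different prior widths:
narrow = bf10(xbar=0.4, se=0.2, tau=0.5)   # favours H1
wide = bf10(xbar=0.4, se=0.2, tau=50.0)    # very diffuse prior favours H0
```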


> At which point you've just found a more cumbersome way to do frequentist statistics.

Hmm, in one way, yes... but on the other hand, Bayesian posteriors are a lot more intuitive for most people to interpret. So I think you trade one form of convenience for another. But as you sort of hint at, the results should usually be fairly similar whether you're doing frequentist or Bayesian analysis. So in most cases, I doubt it matters that much. Where it does matter, is when you have grounds for strong priors, that you want to take advantage of. In such cases you can improve your chances of being correct in the "here and now", if you do a Bayesian analysis. Whereas a frequentist analysis is only concerned with asymptotic error rates. (But of course frequentist vs Bayesian is more of a ladder than a black-and-white distinction.)
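The "here and now" gain is just the usual shrinkage toward the prior; in the conjugate normal-normal case (numbers are hypothetical):

```python
def posterior_mean(xbar, se, mu0, tau):
    """Normal-normal update: a strong prior N(mu0, tau^2) pulls the
    posterior mean toward mu0; as tau grows you recover the sample
    mean, i.e. the frequentist point estimate."""
    w = (1 / tau**2) / (1 / tau**2 + 1 / se**2)  # weight on the prior
    return w * mu0 + (1 - w) * xbar

strong = posterior_mean(xbar=1.0, se=0.5, mu0=0.0, tau=0.2)   # pulled near 0
weak = posterior_mean(xbar=1.0, se=0.5, mu0=0.0, tau=100.0)   # ~= xbar
```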

> Well, the fact is there are too many small-sample studies being produced for all or even most of them to be critically analysed by people with deep understanding.

And this I totally agree with. If there's one thing I dislike about academia, it's the tendency to fund low-powered studies that get nowhere. Better to go all in, with sufficient support from experienced people, in fewer and bigger studies.


> So in most cases, I doubt it matters that much. Where it does matter, is when you have grounds for strong priors, that you want to take advantage of. In such cases you can improve your chances of being correct in the "here and now", if you do a Bayesian analysis.

I completely agree with this - but it's exactly this dynamic that I think, at least in the current academic environment, does more harm than good. Effectively it normalizes publishing a result that's not strong enough to swamp the prior, but where you have some detailed situational argument for why a different prior should be used here. We already get every social science paper arguing that they should be allowed to use a 1-tailed t-test rather than 2-tailed because surely there's no possibility that their intervention would do more harm than good, and you need to get into the details of the paper to see why that's nonsense; letting them pick their own prior multiplies that kind of thing many times over.
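The one-tailed move is easy to see numerically: for a symmetric null, the one-sided p-value is exactly half the two-sided one, which is often the difference between "significant" and not (z = 1.8 below is a hypothetical test statistic):

```python
import math

def p_two_sided(z):
    # P(|Z| >= |z|) under the standard normal null
    return math.erfc(abs(z) / math.sqrt(2))

def p_one_sided(z):
    # Assumes the effect 'can only help' -- the move criticised above
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 1.8                 # hypothetical test statistic
p2 = p_two_sided(z)     # roughly 0.07: not 'significant' at 0.05
p1 = p_one_sided(z)     # exactly half: roughly 0.036, now 'significant'
```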


> letting them pick their own prior multiplies that kind of thing many times over.

I'm a big fan of sensitivity analysis in this context. Don't just pick one prior and call it a day, but show the effect of having liberal vs conservative priors, and discuss that in light of the domain knowledge. That gives the next researcher a much better foundation than a single prior, or a p-value, ever could.
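In the beta-binomial toy case, such a sweep is only a few lines (the priors and counts below are hypothetical):

```python
# Sensitivity sketch: the beta-binomial posterior mean under a sweep of
# priors, from liberal (weak) to conservative (sceptical of any effect).
heads, tails = 7, 3  # hypothetical data

priors = {
    "flat Beta(1,1)": (1, 1),
    "mildly sceptical Beta(5,5)": (5, 5),
    "strongly sceptical Beta(50,50)": (50, 50),
}
means = {name: (a + heads) / (a + heads + b + tails)
         for name, (a, b) in priors.items()}
# The conservative priors pull the posterior mean back toward 0.5;
# reporting the whole sweep shows how much the conclusion leans on
# the prior rather than the data.
```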

Unfortunately, if it was a non-trivial paper to begin with, it now just turned into a whole book.



