
You are assuming that the distributions we work with are normal and that the sampled values are independent. If you take a fat-tailed distribution (for example, Cauchy), the bootstrap wouldn't work. If the values are dependent, it wouldn't work either.

See also here: http://stats.stackexchange.com/questions/172920/bootstrap-me... They give an example of a non-pathological distribution for which the bootstrap doesn't provide a good estimate. It's uniform(0, theta).




That's nonsense. The entire point of the bootstrap is that it does not require either the original or the sampling distribution to be normal. (For that matter, due to the Central Limit Theorem, neither do analytic approximations like a t-test for comparing different group means.)

You misunderstand the point about the Cauchy distribution in the answer on Cross Validated. The Cauchy distribution is a degenerate case, mostly interesting as an academic toy because it has no finite mean or variance. Of course that's not going to fare well.

Dependent data can be tricky to deal with, but you can bootstrap such data by removing the dependence, bootstrapping the independent data, and adding the dependence back in. This sounds hard but is usually as easy as running a regression and subtracting/adding the component (x*beta) that leads to the dependency. Alternatively, for time series there are block (window) methods.
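
To make the regression trick concrete, here is a rough sketch of a residual bootstrap; all names and numbers are illustrative, and only numpy is assumed:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy dependent data: y depends linearly on x.
    x = rng.normal(size=200)
    y = 2.5 * x + rng.normal(size=200)

    # 1) Fit the regression and pull out the residuals (the "independent" part).
    beta = np.polyfit(x, y, deg=1)        # [slope, intercept]
    fitted = np.polyval(beta, x)
    residuals = y - fitted

    # 2) Bootstrap the residuals, add the fitted component back in,
    #    and re-estimate the slope each time.
    slopes = []
    for _ in range(2000):
        resampled = rng.choice(residuals, size=len(residuals), replace=True)
        slopes.append(np.polyfit(x, fitted + resampled, deg=1)[0])

    # 3) Percentile interval for the slope.
    print(np.percentile(slopes, [2.5, 97.5]))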

Of course a short slide deck like the one linked to in this thread is not going to teach you all the finer points of bootstrapping and you can definitely do it wrong. But compared to all the assumptions that frequentist statistics makes to generate confidence intervals and the fact that you need a different method for every different scenario, bootstrapping is about as robust and idiot-proof as it's going to get. "It has its applicability" is beyond selling it short.


> For that matter, due to the Central Limit Theorem, neither do analytic approximations like a t-test for comparing different group means.

Actually, it doesn't. You need a confidence interval, and the central limit theorem doesn't give you a confidence interval by itself. It just says that the sampling distribution gets close to normal eventually. In some cases it can take a very large n before it becomes close to normal.


And exactly how large n needs to be for the approximation to reach a certain level of accuracy can be ascertained using the other technique mentioned in the slide deck: run a simulation. And so we have come full circle :-)
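
For instance, here is a minimal sketch of such a simulation, assuming a skewed (exponential) population purely for illustration; it checks the coverage of a 95% t-interval at different sample sizes:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    def coverage(n, reps=5000, true_mean=1.0):
        # Fraction of simulated 95% t-intervals that actually cover the true mean.
        hits = 0
        for _ in range(reps):
            sample = rng.exponential(scale=true_mean, size=n)
            lo, hi = stats.t.interval(0.95, df=n - 1,
                                      loc=sample.mean(), scale=stats.sem(sample))
            hits += lo <= true_mean <= hi
        return hits / reps

    for n in (5, 20, 100):
        print(n, coverage(n))   # coverage creeps toward 0.95 as n grows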


> The entire point of the bootstrap is that it does not require the either the original or the sampling distribution to be normal

What I wanted to say is that you can't just blindly apply the bootstrap to every distribution. You should carefully check the applicability conditions before using it, or you could easily get nonsensical results.


You can't blindly apply the bootstrap to any problem whatsoever. But you really, absolutely, unequivocally can blindly apply the bootstrap to generate a confidence interval around the mean of data generated from any distribution. You cannot easily get nonsensical results. In a few rare edge cases, you can get suboptimal results (e.g. 80% instead of 95% coverage, or 99% coverage instead of 95% coverage) but even these are not nonsensical. I really don't see why you want to be arguing over facts.
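
For what it's worth, a plain percentile bootstrap for the mean is only a few lines. A sketch, with only numpy assumed and placeholder data:

    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.lognormal(mean=0.0, sigma=1.0, size=500)   # skewed, decidedly non-normal

    boot_means = np.array([
        rng.choice(data, size=len(data), replace=True).mean()
        for _ in range(10_000)
    ])

    print("95% bootstrap CI for the mean:", np.percentile(boot_means, [2.5, 97.5]))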


But what about the example with the uniform distribution? How can we get a confidence interval in that case with the bootstrap?


The example you link to does not estimate the mean of a uniform distribution but rather its upper bound, the theta in U(0, theta), whose natural estimator is the sample maximum. A maximum is an extreme quantile, and one of the very few shortcomings of the bootstrap in realistic scenarios is that it does not always do well at generating confidence intervals around quantiles. Still, scientists can go an entire lifetime without ever feeling the need to estimate the maximum of a uniform distribution. The only famous example of this problem in the context of actual science is the German tank problem: https://en.wikipedia.org/wiki/German_tank_problem
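
If you want to see why the sample maximum trips the bootstrap up, here is a quick sketch (only numpy assumed): the resampled maximum can never exceed the observed maximum, and it equals it with probability about 1 - (1 - 1/n)^n, roughly 63%, so the bootstrap distribution piles up on a single point:

    import numpy as np

    rng = np.random.default_rng(3)
    theta = 10.0
    sample = rng.uniform(0, theta, size=50)

    boot_maxes = np.array([
        rng.choice(sample, size=len(sample), replace=True).max()
        for _ in range(10_000)
    ])

    print("observed max:", sample.max())
    print("share of bootstrap maxima equal to it:",
          np.mean(boot_maxes == sample.max()))   # roughly 0.63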



