Hacker News

From the end of TFA:

>The book is not without its weak moments, although they are few. One in particular which I recall is the treatment of A/B testing. Essential to any hypothesis testing is the matter of how to reduce the sampling mechanism to a simple probabilistic model, so that a quantitative test may be derived. The book emphasizes one such model: simple random sampling from a population, which then involves the standard probabilistic ideas of binomial and multinomial distributions, along with the normal approximation to these. Thus, one obtains the z-test.

>In the context of randomized controlled experiments, where a set of subjects is randomly assigned to either a control or treatment group, the simple random sampling model is inapplicable. Nonetheless, when asking whether the treatment has an effect there is a suitable (two-sample) z-test. The mathematical ideas behind it are necessarily different from those of the previously mentioned z-test, because the sampling mechanism here is different, but the end result looks the same. Why this works out as it does is explained rather opaquely in the book, since the authors never developed the probabilistic tools necessary to make sense of it (here one would find at least a mention of hypergeometric distributions). Given the emphasis placed in the beginning of the book on the importance of randomized, controlled experiments in statistics, it feels like this topic is getting short shrift.
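For concreteness, here are minimal sketches of the two z-tests the reviewer contrasts. This is my own illustration, not from the book, and the counts in the usage lines are invented:

```python
# Sketches of the one-sample z-test (simple random sampling model)
# and the two-sample z-test (treatment vs. control), as contrasted
# in the review. Example counts are made up for illustration.
import math

def one_sample_z(successes, n, p0):
    """z-statistic for H0: population proportion = p0, via the
    normal approximation to the binomial."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)   # SE under the null
    return (p_hat - p0) / se

def two_sample_z(x1, n1, x2, n2):
    """z-statistic for H0: treatment and control proportions are
    equal, using the pooled estimate for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)           # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 540 successes in 1000 draws against a hypothesized rate of 0.5:
z1 = one_sample_z(540, 1000, 0.5)       # ~2.53
# 60/100 successes in treatment vs. 40/100 in control:
z2 = two_sample_z(60, 100, 40, 100)     # ~2.83
```

As the reviewer notes, the two statistics look the same on the surface even though the probability models justifying them differ.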

Can anyone recommend good resources to fill this alleged gap?




I found that I learned a lot about RCTs by going beyond RCTs and reading about causal inference. You learn why each assumption is important when it's broken.

'Causal Inference: What If' is a nice intro and freely available: https://www.hsph.harvard.edu/miguel-hernan/causal-inference-...


In the book, Freedman states that two assumptions of the standard error of the difference are violated by the way subjects are assigned to control and treatment groups in randomized controlled trials (RCTs).

The standard error of the difference assumes a) that samples are drawn independently, i.e., with replacement; and b) that the two groups are independent of each other. By a sample being drawn, I mean a subject being assigned to a group in an RCT here.

If you derive the standard error of the difference, there are two covariance terms that are zero when these assumptions are true. When they're violated, like in RCTs, the covariances are non-zero and should in theory be accounted for. However, Freedman implies that it doesn't actually matter because they effectively cancel each other out, as one inflates the standard error and the other deflates it.


> gap?

E. L. Lehmann, 'Nonparametrics: Statistical Methods Based on Ranks', ISBN 0-8162-4994-6, Holden-Day, San Francisco, 1975.

Sidney Siegel, 'Nonparametric Statistics for the Behavioral Sciences', McGraw-Hill, New York, 1956.

Bradley Efron, 'The Jackknife, the Bootstrap, and Other Resampling Plans', ISBN 0-89871-179-7, SIAM, Philadelphia, 1982.

Hypothesis testing?? Somewhere maybe I still have my little paper I wrote on using the Hahn decomposition and the Radon-Nikodym theorem to give a relatively general proof of the Neyman-Pearson theorem about the most powerful hypothesis test.
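For reference, the standard statement of the Neyman-Pearson lemma being proved there: for testing a simple null against a simple alternative, the likelihood-ratio test is most powerful at its level.

```latex
% Neyman--Pearson lemma (standard statement):
% testing simple H_0: \theta = \theta_0 vs. simple H_1: \theta = \theta_1,
% the most powerful level-\alpha test rejects on large likelihood ratios.
\[
  \text{reject } H_0 \iff
  \frac{L(\theta_1 \mid x)}{L(\theta_0 \mid x)} > k,
  \qquad
  k \text{ chosen so that }
  \Pr_{\theta_0}\!\left(
    \frac{L(\theta_1 \mid X)}{L(\theta_0 \mid X)} > k
  \right) = \alpha .
\]
```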


I’d ignore the critique completely - it lacks internal consistency. The similar final result is due to the central limit theorem, which is a large-n result, and which actually lets you ignore the hypergeometric construct and use a binomial instead, since the two are similar for large n. [edit: grammar]
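The large-n point is easy to check directly. A sketch with made-up numbers, comparing the two pmfs via `math.comb`:

```python
# Sketch: for a large population, hypergeometric probabilities
# (sampling without replacement) are well approximated by binomial
# ones (sampling with replacement). Population size is made up.
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(k marked items in n draws without replacement from a
    population of N items, K of which are marked)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(k successes in n independent draws with success rate p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Draw n=10 from a population of N=10_000 with half marked:
h = hypergeom_pmf(5, 10_000, 5_000, 10)
b = binom_pmf(5, 10, 0.5)
# The two agree to within about 1e-4 here.
```

The approximation degrades when the sample is a large fraction of the population, which is when the finite-population correction matters.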



