> And I disagree that the number of subjects doesn't matter. It matters enormously, precisely because this is a biology experiment.
It's ironic that you say this so definitively when you were the one asking for help interpreting the statistics originally.
Lisper is completely correct. Given a large enough effect size, detecting significant differences in a small population is very possible.
The statistics are sound. The assumption we should be questioning is where the subjects came from and whether they are actually representative of the population that we are extrapolating this result to.
What Lisper said may be correct; however, he doesn't mention how the null hypothesis applies in this experiment. To a skeptic, the null hypothesis might seem susceptible to confusing correlation with causation.
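To make the effect-size point concrete, here is a rough simulation sketch. It is not the study's analysis: it assumes normally distributed measurements with unit variance, a one-sample t-test against zero, and two made-up standardized effect sizes (1.5 and 0.2). The only thing taken from the article is the 11 subjects.

```python
import math
import random
import statistics

N_SUBJECTS = 11        # the sample size from the article
T_CRIT_DF10 = 2.228    # two-sided 5% critical value of Student's t, 10 degrees of freedom

def estimated_power(effect_size: float, runs: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated 11-subject experiments that reach p < 0.05.

    Assumption: each measurement is drawn from Normal(effect_size, 1),
    so effect_size is the true mean shift in standard-deviation units.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        sample = [rng.gauss(effect_size, 1.0) for _ in range(N_SUBJECTS)]
        std_err = statistics.stdev(sample) / math.sqrt(N_SUBJECTS)
        t_stat = statistics.mean(sample) / std_err
        if abs(t_stat) > T_CRIT_DF10:
            hits += 1
    return hits / runs

power_large = estimated_power(1.5)   # hypothetical large effect
power_small = estimated_power(0.2)   # hypothetical small effect

print(f"large effect: detected in {power_large:.0%} of simulated studies")
print(f"small effect: detected in {power_small:.0%} of simulated studies")
```

Under these assumptions the large effect is picked up almost every time with 11 subjects, while the small one is mostly missed. Whether the real study is closer to the first case or the second depends on its actual effect size and variance, which this sketch does not know.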
Edit:
The correct interpretation is that the model, the foundation of the null hypothesis, hasn't failed yet.
Part of this model is backed by more experiments: "The reason that we focused on SWA is that it is the only sleep characteristic that reflects the depth of sleep" [1].
The model doesn't consist of a single variable. 11 people choosing the same 7 numbers out of 49 by chance is rather unlikely. The null hypothesis would include that there are only 11 people picking, that they don't cheat, and that random chance is indeed a thing. If 11 people did all choose the same numbers, the experiment could be repeated, e.g. to show that they are cheating or to increase the significance.
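For a sense of scale, here is a quick back-of-the-envelope calculation of the lottery example. It assumes each of the 11 people picks one of the possible 7-number combinations uniformly at random and independently of the others, which is exactly the kind of assumption the null hypothesis bundles in.

```python
import math
from fractions import Fraction

# Number of distinct ways to choose 7 numbers out of 49.
combinations = math.comb(49, 7)

# The first person may pick anything; the other 10 must match that pick.
# Assumption: uniform, independent picks (no cheating, no favourite numbers).
p_all_match = Fraction(1, combinations) ** 10

# The probability is too small to print usefully, so report its order of magnitude.
log10_odds = 10 * math.log10(combinations)

print(f"possible tickets: {combinations:,}")
print(f"chance all 11 match: about 1 in 10^{log10_odds:.0f}")
```

If real people did all match, the sensible conclusion would be that one of the bundled assumptions (independence, uniform picks, no cheating) is false, not that a once-in-10^79 event occurred.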
> It matters enormously, precisely because this is a biology experiment.
No. If I advance the hypothesis that reciting the Kama Sutra backwards will make you grow a third arm, then a single subject who recites the Kama Sutra backwards and shortly thereafter grows a third arm would be a statistically significant result, because the odds of someone growing a third arm by chance are quite small.
> "I disagree that the number of subjects doesn't matter."
You're asking people with deeper knowledge than you for their help, and then disagreeing with them?
I get it -- you're skeptical because of sample size. But recall that "sample size" isn't necessarily the number of test subjects, but the number of measurements, as the comment about Mercury demonstrates.
Moreover, a strong signal can be detected even from a small sample size; a coin flipping the same way 11 times in a row could be mere chance, but a roulette wheel hitting the same number 11 times in a row should not happen in the entire history of the universe by mere chance. It's not just the number of measurements, but the significance of each one, that matters for statistical confidence.
Sure, you could get an even stronger signal with a larger sample size. But that doesn't mean "a study of 11 people" is necessarily insignificant or too small a sample. It might be too small, or it might be enough to have a very high degree of certainty.
Your analogy to Mercury or flipping coins isn't helpful. (Nor is the Kama Sutra one below.)
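The coin-versus-roulette comparison can be put into numbers with a few lines. This sketch assumes a fair coin and a single-zero wheel with 37 equally likely pockets (an American double-zero wheel would have 38); both are illustrative assumptions, not anything from the article.

```python
from fractions import Fraction

def prob_all_same(outcomes: int, trials: int) -> Fraction:
    """Chance that `trials` independent, uniform draws all land on the same outcome.

    The first draw can be anything; each remaining draw must match it.
    """
    return Fraction(1, outcomes) ** (trials - 1)

coin = prob_all_same(2, 11)        # fair coin, 11 flips
roulette = prob_all_same(37, 11)   # assumed single-zero wheel, 11 spins

print(f"coin:     1 in {coin.denominator:,}")
print(f"roulette: 1 in {roulette.denominator:,}")
```

Same number of observations in both cases, but the coin streak is roughly a 1-in-a-thousand event and the roulette streak is on the order of 1 in 10^15, because each individual roulette match is far less likely under the "pure chance" hypothesis.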
I was talking about this experiment, the one the article is about, the one which I linked in my comment. The one which is apparently attracting modestly widespread media coverage.
The analogies are meant to demonstrate that sample size is (1) not as straightforward as asking "how many subjects" and (2) less important in circumstances where the signal is strong.
In other words, we're trying to give you the tools to evaluate for yourself. The initial article is about "an experiment involving 11 people", and your skepticism seemed to be about whether it's even possible to get valid results with that sample size (answer: yes). Even in this study, yes, you can get a valid result from "only" 11 people.
And I disagree that the number of subjects doesn't matter. It matters enormously, precisely because this is a biology experiment.