The 5% figure means that, when there is no signal to detect, we have a 5% chance of falsely claiming there is one. It does not say anything about the case when there is a signal and we do not detect it, which is known as the type II error rate.
With reasonable assumptions about sample size and the fraction of times there really is a signal, you can find that the majority of published results are false.
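To make that concrete, here is a rough sketch of the arithmetic, not anyone's official method. The function name `false_discovery_rate` and the example priors and power values are my own assumptions, picked only for illustration; the one input taken from this thread is the 0.05 threshold.

```python
def false_discovery_rate(prior, power, alpha=0.05):
    """Fraction of 'significant' results that are actually false positives.

    prior: assumed fraction of tested hypotheses where a signal really exists
    power: assumed chance of detecting a real signal (1 - type II error rate)
    alpha: significance threshold (type I error rate)
    """
    false_positives = alpha * (1.0 - prior)  # no signal, but p < alpha anyway
    true_positives = power * prior           # real signal, and we detect it
    return false_positives / (false_positives + true_positives)


# Hypothetical scenarios, from optimistic to pessimistic.
scenarios = [(0.5, 0.8), (0.1, 0.8), (0.1, 0.3), (0.05, 0.2)]
for prior, power in scenarios:
    fdr = false_discovery_rate(prior, power)
    print(f"prior={prior:.2f} power={power:.2f} -> {fdr:.0%} of positives are false")
```

With a 10% prior and 30% power (both invented numbers), 60% of the significant results come out false, even though every individual test honestly used the 0.05 threshold.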
> It's crazy that by using a p-value of 0.05, it means that 5% of all scientific results might be false.
That would only be the case if scientists were robots who immediately published anything with a p-value up to 0.05. They're not, though. If they get clearly nonsensical results, they will obviously re-evaluate them. In other words, the p-value doesn't incorporate the fact that the experiment passed sanity checks in your own head (and the reviewers') before it was published. (And yes, there are bad actors in every field who game the system, but my point still stands.)
This sounds like some kind of comment about divisibility of the work to publish something. I don't get it though.
After the bulk of the paper is written, I can easily proofread, typeset, etc. everything myself in less than a week. Now get someone else to double check that. Let's say that is another week.
After that the only thing is to get someone worthwhile to spend some time on your paper and point out anything confusing or erroneous. Granted, this could take a month or so of study. However, I never really saw that happen in practice. In reality you would be lucky to get people to glance over it one evening.
In my experience, a significant fraction of the time it takes to publish a paper is spent waiting for the journal. During that time you can do other useful research. The long delay between submitting, getting through the reviewers, and the actual publication is one of the reasons why, for example, in CS a lot of the interesting stuff happens in conference publications with fast turnarounds, and the journal versions of the same papers appear a year or two later.
I suspect this factor is balanced, or completely overruled, by the scientists who get p-values greater than 0.05, decide that the result doesn't pass their sanity check (it clearly should be significant!), and collect more data or tweak the methods until it's significant.
A 5% false discovery rate is roughly what you get if the a priori probability of each result is 50% (and the tests have high statistical power). If a journal wants to publish only surprising results, and accepts p=0.05 with a good methodology as true, it will have a higher rate of false claims, because the most interesting things are more surprising than a coin toss.
And if you slice the data from a single experiment in 40 independent ways, your chances of getting at least one purely random result with p<0.05 are better than 50% for a single study (about 87%, since 1 - 0.95^40 ≈ 0.87)…
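To put a number on the slicing argument, here is a minimal sketch. It assumes the tests are fully independent and that every null hypothesis is true; the function names are made up for illustration. The Bonferroni line is a standard correction I'm adding for comparison, not something claimed above.

```python
def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent null tests gives p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false positive chance near alpha."""
    return alpha / n_tests


for n in (1, 14, 40):
    print(f"{n:>2} slices: {chance_of_any_false_positive(n):.3f}")

# With the corrected per-test threshold, 40 slices stay just under 5% overall.
corrected = bonferroni_threshold(40)
print(f"corrected threshold: {corrected:.5f}")
print(f"overall chance with correction: {chance_of_any_false_positive(40, corrected):.3f}")
```

Under these assumptions the 50% mark is already crossed at 14 slices, and 40 slices gives roughly 0.87.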
It doesn't "mean" that the a priori is 50%. I guess what you want to say is that if we consider the a priori probabilities to be equal between the two hypotheses, then we can use the 5% p-value.
But if the hypothesis is much less likely a priori, we might need a much lower p-value to be sure.
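As a back-of-the-envelope sketch of how much lower: solving FDR = alpha(1 - prior) / (alpha(1 - prior) + power * prior) for alpha gives the threshold needed to hit a target false discovery rate. The function name `required_alpha` and all the example values are assumptions for illustration only.

```python
def required_alpha(prior, power, target_fdr=0.05):
    """Significance threshold needed so that only target_fdr of positives are false.

    prior: assumed a priori probability that the hypothesis is true
    power: assumed probability of detecting a true effect
    """
    return (target_fdr * power * prior) / ((1.0 - target_fdr) * (1.0 - prior))


# Hypothetical priors, from a coin toss down to a long shot, at 80% power.
for prior in (0.5, 0.1, 0.01):
    print(f"prior={prior:.2f} -> alpha needed: {required_alpha(prior, 0.8):.5f}")
```

For a coin-toss prior the needed threshold stays close to 0.05, but for a 1% prior it drops below 0.001 under these assumptions.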
Edit: my comment doesn't mean much since you edited your first sentence.
>"It's crazy that by using a p-value of 0.05, it means that 5% of all scientific results might be false."
How so? I don't see any connection between significance level and % of false scientific results at all.
If you assume the "null hypothesis" is always true, then 5% of the results should falsely say otherwise. Of course, this is if all the assumptions behind the math hold, no p-hacking, etc.
However, that is like saying it is extremely rare for there to be a correlation between any two phenomena. We don't live in that universe. In our universe, correlations are extremely common:
>"These armchair considerations are borne out by the finding that in psychological
and sociological investigations involving very large numbers of subjects, it is regularly
found that almost all correlations or differences between means are statistically
significant. See, for example, the papers by Bakan [1] and Nunnally [8].
Data currently being analyzed by Dr. David Lykken and myself, derived from a
huge sample of over 55,000 Minnesota high school seniors, reveal statistically significant
relationships in 91% of pairwise associations among a congeries of 45 miscellaneous
variables such as sex, birth order, religious preference, number of siblings,
vocational choice, club membership, college choice, mother's education, dancing,
interest in woodworking, liking for school, and the like. The 9% of non-significant
associations are heavily concentrated among a small minority of variables having
dubious reliability, or involving arbitrary groupings of non-homogeneous or nonmonotonic
sub-categories. The majority of variables exhibited significant relationships
with all but three of the others, often at a very high confidence level"
It's crazy that by using a p-value of 0.05, it means that 5% of all scientific results might be false.