* In 2000, a rule was passed that clinical trials must include their hypotheses (i.e. clinical outcomes) in their initial registration before the trial is conducted.
* From the graph, the number and proportion of NHLBI clinical trials with positive outcomes (i.e. an improvement in some health condition) are far lower after 2000 than before.
It was previously easy for researchers to conduct a trial, find positive trends in the data afterwards, and then frame those trends as if the trial had originally been intended to test that positive outcome. This is a problem related to A Priori Probability, where you're essentially turning research, which could have been statistically applied to a larger population given some statistical confidence, into deductive reasoning which doesn't necessarily apply beyond the finite sample size [1] (also see @entee's comment and links for better explanation of why this is important).
This article is stating that the graph implies that, because there are fewer positive outcomes after this new rule in 2000 than before, most of the positive outcomes prior to 2000 (which likely led to new FDA-approved drugs on the market) must have been illegitimate.
I would point out, though, that at least from that graph, the last positive outcome shown prior to 2000 was in 1996, so it's not as if positive outcomes obviously continued right up until 2000 and then suddenly stopped. Even when the rule was imposed in 2000 (I'm not sure how it was phased in), there had already been a four-year gap since the last positive outcome, and I don't see another four-year gap anywhere back to around 1979. I'm not saying I disagree with the article (obviously), just that while the graphic seems to imply a trend, more data would be needed to show that trend conclusively.
The 90s is also around the time when we pretty much ran out of small-molecule blockbuster drugs (the very profitable but low-hanging fruit) and started shifting over to much more complex multi-drug therapies like those for HIV and cancer. There are still a few such small-molecule candidates left, but the major markets like blood pressure, heart disease, arthritis, etc. are very saturated. As clinical trials start to target more complex diseases with more complicated drugs and therapies, that adds a large confounding factor to the decrease in the number of successful trials.
> We identified all large NHLBI supported RCTs between 1970 and 2012 evaluating drugs or dietary supplements for the treatment or prevention of cardiovascular disease.
So this is for a narrow field where effective drugs already exist and where all the low-hanging fruit has been fully picked, because the business opportunity is huge.
> This is a problem related to A Priori Probability, where you're essentially turning research, which could have been statistically applied to a larger population given some statistical confidence, into deductive reasoning which doesn't necessarily apply beyond the finite sample size [1] (also see @entee's comment and links for better explanation of why this is important).
Maybe you are using these terms differently than they are used professionally, but this is not correct if it is meant as a criticism of Bayesian methods.
There is nothing inherently better (and in fact there are several things potentially worse) about looking at the data with inductive reasoning about a potentially larger population with assumed baseline properties.
The problem here is simply p-hacking, whether by accident or intentional. This can be done whether you are performing Bayesian analysis involving a prior distribution, or performing frequentist analysis with assumptions about null hypothesis and subjectively chosen rejection criteria.
If you don't register your model validation procedures ahead of time, then after the fact you can data mine among many (or even all possible) model validation scores and only publish scores that show a positive effect.
It's not about one framework of statistics or the other. It's about statistical rigor for whatever framework. It's about pursuing more than one study. And it's about attempting to use holistic model validation techniques that cover wide ranges of anticipated outcomes, to hopefully drive down the chances that you develop a conclusion from the study based only upon a narrow set of outcomes that could be biased.
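That data-mining failure mode is easy to demonstrate with a minimal sketch (all numbers here are hypothetical, not from the study): simulate a trial with no real effect but 20 candidate outcome measures, and publish only the best-looking one.

```python
import random
import statistics

random.seed(0)
normal = statistics.NormalDist()

def one_outcome_pvalue(n=100):
    """Two-sided p-value comparing two groups drawn from the same null."""
    treat = [random.gauss(0, 1) for _ in range(n)]
    control = [random.gauss(0, 1) for _ in range(n)]
    diff = statistics.mean(treat) - statistics.mean(control)
    se = (2 / n) ** 0.5              # std error of the difference (sigma = 1)
    return 2 * (1 - normal.cdf(abs(diff / se)))

def best_of(n_outcomes=20):
    """Smallest p-value among n_outcomes independent null outcomes."""
    return min(one_outcome_pvalue() for _ in range(n_outcomes))

trials = 1000
hits = sum(best_of() < 0.05 for _ in range(trials))
print(f"null trials reporting p < 0.05: {hits / trials:.0%}")
# expected roughly 1 - 0.95**20, i.e. around 64%
```

Even though every single outcome is pure noise, cherry-picking the best of 20 yields a "significant" finding in roughly two thirds of trials, which is exactly why pre-declared outcomes matter.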
I am definitely not using the terms as they are used professionally (and certainly not a criticism of Bayesian methods in any way).
It was an attempt to explain in more lay terms how when p-hacking is performed on some dataset, the "findings" can rarely be applied to larger (or different) datasets due to the arbitrary criteria used to reject the parts of the data that aren't conducive to the selected outcome, since the rejection criteria were sort of custom-fit to that specific dataset.
Was it possible before this change to run a trial and never publish the outcome? If so, it's also possible that people just kept running trials until they got a positive outcome.
> while the graphic used seems to imply a trend, it seems like more data would be needed to conclusively show that trend.
And you can tell that by looking at the graphic...? There's also a reference to a χ²-test with p=0.0005 (meaning that, if there were no real trend, a pattern at least this strong would show up by chance only 0.05% of the time).
I think you're right to be skeptical (correlation is not causation and so on...), but not that too little data is the problem here.
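For context, the χ²-test here is a 2×2 test of positive-vs-null trial counts before and after 2000. A from-scratch sketch of how such a test is computed — the counts below are made up for illustration, NOT the study's actual numbers:

```python
import math
import statistics

# Hypothetical 2x2 table: rows = before/after 2000,
# columns = positive/null outcomes. NOT the study's real counts.
table = [[17, 13],
         [2, 23]]

row = [sum(r) for r in table]
col = [sum(c) for c in zip(*table)]
total = sum(row)

# chi-squared statistic: sum of (observed - expected)^2 / expected
chi2 = sum(
    (table[i][j] - row[i] * col[j] / total) ** 2 / (row[i] * col[j] / total)
    for i in range(2) for j in range(2)
)
# For 1 degree of freedom, P(chi2 > x) = 2 * (1 - Phi(sqrt(x)))
p = 2 * (1 - statistics.NormalDist().cdf(math.sqrt(chi2)))
print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
```

The df=1 shortcut works because the square of a standard normal is χ²-distributed with one degree of freedom, so no special chi-squared CDF is needed.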
By too little data, I didn't mean that they should have collected more data. I meant that it would be nice to see a) other changes within the drug industry from the 90s and early 2000s that may or may not have contributed, plotted on the same graph (not just a single line for this one rule, as if that's the only event to affect clinical trials since the 70s), and b) more clinical trial outcome data into the future. More generally, I meant that it's hard to attribute historical events conclusively to specific causes, because we can never rewind time, change one variable, and observe its effects.
There's a huge amount of concern in the medical profession that lots of studies in the area are statistically compromised precisely because the authors gamed the statistics after the fact, or because improper metrics were used in the first place.
See these (kind of old but still quite appropriate) links:
It's this kind of work that led to pressure to pre-declare statistical objectives, and now we see the results. Biology is hard, medicine is hard, we must be quite humble about what we understand in these fields and therefore how easy it is to be misled by early promising results.
When experimenters are required to state their targets beforehand, they fail to meet them. If they don't have to state the outcomes beforehand, then it's much easier to spin whatever they do find as beneficial!
(just skimming) I am guessing the y-axis is the observed effect of the treatment (low being good) and the dots are re-colored to null if the sample size was too small to separate the (not shown) error bars around the dot from no effect.
The punchline is: once the researchers had to declare their statistical procedure before turning in the results most of the dots get colored "null." Meaning the strongest effect in the past was the ability to argue your way into a more favorable analysis.
The studies on the chart are for treatment of cardiovascular disease. The lower they are on the chart, the more they reduced the risk of whatever it was they were trying to prevent (the "primary outcome" for each study).
In other words, declaring your lottery number picks prior to the drawing results in fewer winners than being able to pick your numbers afterwards. Ergo, pre-declaring your picks causes lotteries to stop working.
[1] http://www.investopedia.com/terms/a/apriori.asp