Best understood in context as a combination of medical humour and a critique of evidence-based medicine: in this case, pointing out the lack of randomized controlled trials of parachutes, and how that lack would affect the evaluation of the efficacy of parachutes as a life-saving medical intervention.
This must once have been true of seatbelts too, before all the experiments that were done with dummies and cadavers.
One could argue that similar experiments should be done with parachutes: throwing dummies and cadavers out of aircraft onto various surfaces (rock, grass, woodland, etc.).
It's an interesting engineering problem too (you need a robot to pull the ripcord, but then it needs to get out of the way so it does not interfere with the outcome by causing additional injuries, adding too much weight, etc.).
The problem here is that we have a condition ("jumping out of aeroplanes") where, without an intervention ("give the patient a parachute") the consequences are almost certainly fatal (yes, a handful of people have survived falls from altitude without parachutes: but a fatal outcome is sufficiently certain that defenestration has historically been used as a means of execution).
It's a major ethics no-no to expose healthy patients to potentially-fatal environments. It's also a major ethics no-no to withhold known life-saving treatment in the name of continuing a trial. So we can't contrive a proper randomized double-blind placebo-controlled test of the life-saving efficacy of parachutes, where we push people out of planes repeatedly to see how many survive without a parachute vs. with a parachute.
We can't test parachutes any more than we can contrive an evidence-based test of the efficacy of being-rescued-from-burning-buildings-by-firefighters, by arranging for a city's fire department to randomly not rescue half the victims of house fires, so that we can subsequently compare their survival rate with the survival rate of those who were rescued.
Basically, evidence-based assessment of medical treatments breaks down when it hits the edge condition defined by the patients automatically dying if not treated. At best we can compare different types of parachute, or different types of emergency treatment, and see which has a lower associated mortality rate. But we can't, ethically, compare treatment/no-treatment if there's a high likelihood of no-treatment resulting in death.
But the thing is that you can use proxies to study this to an extent.
Amongst rock climbers there is a naturally occurring population of free climbers, and it is possible to compare the type and severity of injuries between different types of rock climbers (although I'm guessing your n is going to be low).
The safety of parachutes themselves is an interesting question because they're not used as a preventative mechanism; I think that's the primary difference between them and climbing ropes/harnesses/pitons.
Edit: I just noticed that I didn't address the randomized component. I'd still argue that evidence-based medicine can actually include arguments that aren't randomized trials. That said, randomized trials are still one of the strongest sorts of argument that can be made. As noted above, seatbelts are another sort of device whose efficacy I don't think we need randomized trials to test.
p.s. Just finished (and enjoyed) The Apocalypse Codex. The Laundry Files is one of my favorite series of books, as someone who's always believed that empiricism and existentialism are reconcilable. Please write more :)
http://www.antipope.org/charlie/blog-static/2012/10/still-un... mentions "Neptune's Brood", a space opera set 5k years after "Saturn's Children", but it sounds like it may not actually be coming out until next year, along with three volumes of Merchant Princes (the existing books, in omnibus editions with minor changes) and maybe a UK edition of "The Rapture of the Nerds".
"Neptune's Brood" comes out in July 2013. (It's a high-concept space opera and a meditation on the 2007-08 liquidity crisis. Also, nominally, a sequel to 2008's Hugo-nominated "Saturn's Children".) "The Rhesus Chart" is in the pipeline for July 2014. The other stuff is all, effectively, reprints.
> It's also a major ethics no-no to withhold known life-saving treatment in the name of continuing a trial.
The problem with this, of course, is that under certain circumstances terminating trials early for ethical reasons causes them to conclude that the treatment works when it doesn't.
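This inflation is easy to demonstrate with a toy simulation (a sketch only, not a model of any real trial protocol; the batch sizes and number of interim looks are arbitrary choices): under a true null effect, repeatedly peeking at the accumulating data and stopping at the first "significant" result pushes the false-positive rate well above the nominal 5%.

```python
import math
import random

def sequential_trial(n_per_look=50, looks=5, z_crit=1.96, rng=random):
    """One simulated two-arm trial with NO true treatment effect.

    After each batch of patients we 'peek' at the data and stop the
    trial as soon as the z statistic crosses the significance cutoff.
    """
    control_sum = treated_sum = 0.0
    n = 0
    for _ in range(looks):
        for _ in range(n_per_look):
            control_sum += rng.gauss(0.0, 1.0)  # control arm, true mean 0
            treated_sum += rng.gauss(0.0, 1.0)  # treatment arm, also mean 0
        n += n_per_look
        # z statistic for a difference of two sample means (known variance 1)
        z = (treated_sum / n - control_sum / n) / math.sqrt(2.0 / n)
        if abs(z) > z_crit:
            return True  # trial stopped early, declared "significant"
    return False

random.seed(0)
trials = 4000
false_positive_rate = sum(sequential_trial() for _ in range(trials)) / trials
print(round(false_positive_rate, 3))
```

With a single final analysis the false-positive rate would sit near 5%; with five interim looks and stop-on-significance it roughly triples. Group-sequential designs with stricter interim stopping boundaries exist precisely to control this.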
Actually, pretty much all the apparently well thought-out ethical restrictions can lead to disaster, and there's even a set of studies that manages to accidentally demonstrate all of them - the randomized controlled trials into circumcision as a form of HIV prevention in Africa. In addition to rather unwisely terminating the study early, they were also careful to use sterile surgical instruments, made sure to inform the participants that they'd still need to use condoms and give them a supply of condoms and education on how to use them, and ensured that the newly-circumcised individuals refrained from sex whilst their wounds healed. All perfectly reasonable ethical requirements, and not ones we'd want to do away with.
Unfortunately, the actual interventions that have been made based on this study have none of those features. Men are being circumcised with bloody instruments, under the belief that this means they don't have to use condoms anymore, and any funding that's going towards circumcision is funding that's not being used to supply condoms and condom education. The difference between what's actually happening and what can ethically be studied is huge and most likely fatal. I suspect the whole thing's going to end up turning out to be another ill-conceived Western intervention that sows distrust a few years down the line.
Those are called static lines, but your parachute is open within seconds of leaving the aircraft. That doesn't allow you any freefall time in an "experiment."
Skydivers have had Automatic Opening Devices available for years, and some people use them. They're set to open at a specific altitude if you haven't already opened. The idea is that you may have lost consciousness from a medical problem or collision with another skydiver.
Aside: I used to skydive way back in the day. At the time the skydiving community was making the transition from round parachutes to square parachutes for students (the question had already been settled for experienced skydivers). My home drop zone did some testing with a few models of square parachute and a weighted dummy. They called it Elmer Thud.
We used to throw a dummy from the plane, arranged to crash behind the hangar where we taught the skydiving intro class. Students would be horrified until a "roughed up" skydiver (who had hidden behind the hangar earlier) came out limping and cursing: "My f%$%# chute didn't open again!" :)
TL;DR: dark skydiver humor.
Skydivers have been using automatic safety openers for a very long time (since the '80s at least). We basically install them on our reserve chute so that if we pass below opening altitude at a fast rate of descent, it automatically triggers the emergency chute.
It's saved many, many lives that I can personally recall.
Similar to how, in computing, the '80s is like a millennium ago, skydiving has changed dramatically since the '80s in regards to both equipment and the skills on which skydivers concentrate.
You never see round parachutes, rigs without a reliable Automatic Opener on the reserve, or Dacron (a material used for lines). The parachutes are half the size or less for average jumpers. As for behavior, people intentionally increase their speed on landing for more fun, wingsuiting is fairly common, and there are sub-disciplines of freefall where people spend the entire freefall portion of the skydive upside down.
> This must once have been true of seatbelts too, before all the experiments that were done with dummies and cadavers.
Surprisingly enough, it was true of airbags even after those experiments: in the late '80s and early '90s, it was noticed that the airbag fatality rate was shockingly high compared to what had been projected, so they went back to the drawing board and revised the designs. A quote from a RAND study on autonomous cars:
> This tension produced "a standoff between airbag proponents and the automakers that resulted in contentious debates, several court cases, and very few airbags" (Wetmore, 2004, p. 391). In 1984, the US DOT passed a ruling requiring vehicles manufactured after 1990 to be equipped with some type of passive restraint system (e.g., air bags or automatic seat belts) (Wetmore, 2004); in 1991, this regulation was amended to require air bags in particular in all automobiles by 1999 (Pub. L. No. 102-240). The mandatory performance standards in the FMVSS further required air bags to protect an unbelted adult male passenger in a head-on, 30 mph crash. Additionally, by 1990, the situation had changed dramatically, and air bags were being installed in millions of cars. Wetmore attributes this development to three factors: First, technology had advanced to enable air-bag deployment with high reliability; second, public attitude shifted, and safety features became important factors for consumers; and, third, air bags were no longer being promoted as replacements but as supplements to seat belts, which resulted in a sharing of responsibility between manufacturers and passengers and lessened manufacturers' potential liability (Wetmore, 2004). While air bags have certainly saved many lives, they have not lived up to original expectations: In 1977, NHTSA estimated that air bags would save on the order of 9,000 lives per year and based its regulations on these expectations (Thompson, Segui-Gomez, and Graham, 2002). Today, by contrast, NHTSA calculates that air bags saved 8,369 lives in the 14 years between 1987 and 2001 (Glassbrenner, undated). Simultaneously, however, it has become evident that air bags pose a risk to many passengers, particularly smaller passengers, such as women of small stature, the elderly, and children. 
> NHTSA (2008a) determined that 291 deaths were caused by air bags between 1990 and July 2008, primarily due to the extreme force that is necessary to meet the performance standard of protecting the unbelted adult male passenger. Houston and Richardson (2000) describe the strong reaction to these losses and a backlash against air bags, despite their benefits. The unintended consequences of air bags have led to technology developments and changes to standards and regulations. Between 1997 and 2000, NHTSA developed a number of interim solutions designed to reduce the risks of air bags, including on-off switches and deployment with less force (Ho, 2006). Simultaneously, safer air bags, called advanced air bags, were developed that deploy with a force tailored to the occupant by taking into account the seat position, belt usage, occupant weight, and other factors. In 2000, NHTSA mandated that the introduction of these advanced air bags begin in 2003 and that, by 2006, every new passenger vehicle would include these safety measures (NHTSA, 2000).
"Contributors GCSS had the original idea. JPP tried to talk him out of it. JPP did the first literature search but GCSS lost it. GCSS drafted the manuscript but JPP deleted all the best jokes. GCSS is the guarantor, and JPP says it serves him right."
This is kind of making me froth: of course parachutes are tested. Lots and lots of R&D time has gone into developing better crew/passenger escape systems for aircraft.
The authors are simply whining about evidence-based medicine, which is now finally being pushed after a realisation that a lot of the medical interventions we use are based on gut feeling, hearsay, or ancient, tiny, badly-implemented trials.
Human beings are extremely good at seeing patterns where there are none. We need good, solid stats to keep us grounded in reality.
Yeah, the problem is always that very few, very few indeed, medical procedures rise to the prior that parachutes have... but anti-EBM people somehow have a surprisingly long list of such procedures, a list that rather resembles the list they would've made before EBM began insisting on RCTs.
I like EBM. I would love to live in a world in which it is practiced properly. One of the biggest problems with EBM, though, is that what is often advertised as EBM actually isn't.
My ex was a medical doctor, and for fun I used to read NEJM and BMJ articles through her subscription. If you trace through references, you often find a (reasonably rigorous) study with a reasonable effect observed on a group of 40 Norwegian women aged 40-50; which in a later article referencing it is assumed to apply to all women in that age group; and which through the years is assumed to apply to all women in all age groups. And that's the part that is "easy" to trace (except no one ever does, and it's not really easy).
Additionally, the statistical evidence is inherently weaker than is presented (in a way which no one can actually evaluate) due to the lack of accounting for negative results and unreported trials.
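The publication-bias half of that claim can be sketched with a toy simulation (illustrative only; the effect size, study sizes, and publication rule here are all made up): if only "positive and significant" small studies get reported, the published literature overstates the true effect even when every individual study is honest.

```python
import math
import random

random.seed(1)
TRUE_EFFECT = 0.2   # real benefit, in standard-deviation units (assumed)
N = 30              # per-arm size of each small study (assumed)

all_estimates, published = [], []
for _ in range(2000):
    control = [random.gauss(0.0, 1.0) for _ in range(N)]
    treated = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N)]
    estimate = sum(treated) / N - sum(control) / N
    se = math.sqrt(2.0 / N)  # standard error of the difference in means
    all_estimates.append(estimate)
    if estimate / se > 1.96:  # only "positive, significant" gets written up
        published.append(estimate)

mean_all = sum(all_estimates) / len(all_estimates)
mean_published = sum(published) / len(published)
print(round(mean_all, 2), round(mean_published, 2))
```

Each published estimate must clear the significance threshold, which with these numbers is about 0.51 SD, so the mean of the published studies is bound to sit far above the true 0.2 SD effect; averaging only the published results looks like strong evidence of a large effect that doesn't exist.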
Also, the logic accepted in EBM is not acceptable from a math perspective. Seth Roberts's recent blog post gives a striking example: http://blog.sethroberts.net/2012/10/11/jama-jumps-to-conclus... ; tell someone about YOUR (n=1, but better controlled than most medical tests!) experiment with Vitamin D, and they'll likely counter with "Oh, but that has been refuted. It's just placebo". That's a big percentage of how EBM is used in practice. And that's the part I don't like.
> Seth Roberts's recent blog post gives a striking example: http://blog.sethroberts.net/2012/10/11/jama-jumps-to-conclus.... ; Tell someone about YOUR (n=1, but better controlled than most medical tests!) experiment with Vitamin D, and they'll likely counter with "Oh, but that has been refuted. It's just placebo". That's a big percentage of how EBM is used in practice.
Actually, it's funny that you bring that up. Yes, my own vitamin D self-experiments were more rigorous than Roberts's vitamin D experiments/anecdata, and both of my self-experiments used blinding precisely to avoid the placebo refutation, so anyone trying to refute them so simplistically would be engaged in poor reasoning; but Roberts is not himself innocent of poor reasoning!
I pointed out in his previous post on that study that his criticism makes no sense since the injection happened once a month, which implies the following dilemma (if we believe his criticism): the effect from bad timing should either be undetectable due to contaminating only 1/31 of the data (1 day out of each month), or if the negative effect is persistent over many days, renders every single anecdote & self-experiment (including mine) completely unreliable trash since none of them were expecting an effect which contaminates so many days. So either his criticism is wrong or his entire collection of data is worthless. (Besides that, plenty of vitamin D studies have shown benefits while completely ignoring administration time. What's sauce for the goose is sauce for the gander.)
So maybe EBM is abused in practice... but I don't see your examples showing this.
> So maybe EBM is abused in practice... but I don't see your examples showing this.
I was giving a (paraphrased) example from my history - it was 10 years ago, and I didn't keep my notes.
But Seth's example is typical: a result (negative in this case) applicable very narrowly is assumed to apply very broadly (by a top practitioner in the field, and his view is unlikely to be rejected; most readers will take his conclusion without trying to confirm or criticize the rationale).
> the effect from bad timing should either be undetectable due to contaminating only 1/31 of the data (1 day out of each month), or if the negative effect is persistent over many days, renders every single anecdote & self-experiment (including mine) completely unreliable trash since none of them were expecting an effect which contaminates so many days.
I don't follow you; in this experiment, they gave 100,000 IU once a month, meaning that there must be a sharp rise at the beginning, followed by a decay which we believe (from other experiments) to happen slowly over a month.
If the effects are modulated by the _delta_ in vitamin D, rather than the _level_ (e.g. if there's a phase-locked-loop somewhere), then Seth's reasoning applies perfectly well: In the JAMA experiment, there is a (very small) constant negative delta, and nothing for a biological PLL synthesizer to "lock on" to - whereas with your experiment (and other anecdata collected by Seth), the sleep cycle can be locked onto the Vitamin D delta (with 12 hour delay), and all would make sense.
> so anyone trying to refute them so simplistically would be engaged in poor reasoning;
Yes, that's exactly my point. Most people who think they follow EBM are actually engaged in poor reasoning, and even worse meta-reasoning, as in: "The fact that other people did not notice that this evidence is inadequate is evidence that YOUR reasoning is flawed". Yes, more than one person with an MD/PhD told me that if what I claim about how B12 is produced were right, they would have learned about it in school; even after they relented and looked it up themselves!
Furthermore, most placebo-controlled double-blinded experiments are not actually blinded: the substance or procedure being tested is known to have an effect (e.g. dry mouth) but it is not known if it has the desired effect. The placebo is known to have no effect (e.g. sugar pills). The patient therefore knows when they got the real thing (though they're still in the dark if they got the placebo).
The only way this is ever controlled for (if at all), is by comparing to an accepted substance/procedure used for the same purpose - which might, itself, be subject to the same kind of bias.
So, to cut a long story short ... The ideal of EBM is nice. The practice is very, very far away, but is assumed by most EBM proponents to have all the properties of the ideal.
> In the JAMA experiment, there is a (very small) constant negative delta, and nothing for a biological PLL synthesizer to "lock on"
Exactly. With nothing to lock onto, there should be no effect, or a 'very small' effect, since people will continue their normal circadian rhythms and light-exposure levels, which zeitgebers will, just as usual, adjust daily - if that delta over the entire month even matters, which it probably doesn't, since vitamin D is fat-soluble and should be released gradually, as needed, from fat stores.
To say that there should be a strong benefit to colds and this tiny non-existent effect explains why we don't see it (without also explaining why this doesn't eliminate all other benefits observed from vitamin D, since use of injections is common in vitamin D experiments) is just totally obvious special pleading by Roberts.
(And yes, the relevant factor cannot be the level, because that is flagrantly inconsistent with my self-experiments where the level was kept constant and only the timing was varied.)
> Furthermore, most placebo-controlled double-blinded experiments are not actually blinded: the substance or procedure being tested is known to have an effect (e.g. dry mouth) but it is not known if it has the desired effect. The placebo is known to have no effect (e.g. sugar pills). The patient therefore knows when they got the real thing (though they're still in the dark if they got the placebo).
I guess I can't support 'most'. It did happen in two out of three experiments for which I actually had intimate details.
But this is one of those things that will never get a trustworthy citation, and yet is probable (replace "most" with "significantly many" to increase probability according to your own probability measure).
A drug tested today on humans has, with high probability, been shown to be safe, but most importantly, effective, in an animal model - or is already known to be effective and safe in humans for another use.
Note the term "effective". That means that it is known to have a measurable effect. For many drugs, it is therefore probable that the (human) subject receiving the treatment can tell that it isn't placebo.
Specifically, quite a few psychiatric drugs are known to cause obesity, compulsive thoughts and lactation (even in males). Niacin is known to cause flushing (at lower levels, it is not externally visible, but the sensation is still there). Viagra was discovered during an attempt to treat hypertension. I'm sure the subjects could tell it from the placebo. Similarly for Minoxidil.
To actually assess how prevalent this is, you need at least access to notes made during the experiment, or worse, you need to do measurements and stats that weren't made in the first place (and making them may only make the results less promising, which is a problem in for-profit development).
"Contributors GCSS had the original idea. JPP tried to talk him out of it. JPP did the first literature search but GCSS lost it. GCSS drafted the manuscript but JPP deleted all the best jokes. GCSS is the guarantor, and JPP says it serves him right."
Maybe I'm missing the point of this entire thread and don't truly grasp the argument for EBM vs. correlation, but can someone explain why a statistically significant correlation for this kind of efficacy is not evidence in and of itself?
Those aren't prospective randomized controlled trials. They're retrospective observational studies. They are evidence of efficacy, no doubt, but not particularly convincing evidence, for all the normal reasons observational studies are often unconvincing, but maybe even more so in light of some other contradictory results in the literature, as well as the general concept of risk compensation (i.e., it's not like a parachute, where it's very clearly obvious that it works).
I believe (from memory) that the argument is that wearing helmets deters people from cycling, which has a cost in terms of deaths from conditions related to lack of exercise.
(and that's a broader argument than the one covered by the paper you link to)
I see the point they're trying to make, but randomized controlled trials would still be valuable to determine the relative effectiveness of different parachute designs.
Some of the comments made by HN participants who posted during the first hour this 2003 article was posted bring up the main point: clinical trials should be informed by prior probabilities. That's why some medical researchers, while not saying that clinical trials are a bad idea, identify the best practice as "science-based medicine" (medicine that takes into account prior probabilities and basic principles of science when evaluating clinical trials) rather than "evidence-based medicine" (which many observers take to be a designation for relying on clinical trials to evaluate treatments). The "About" page on the Science-Based Medicine group blog site puts it well:
"Good science is the best and only way to determine which treatments and products are truly safe and effective. That idea is already formalized in a movement known as evidence-based medicine (EBM). EBM is a vital and positive influence on the practice of medicine, but it has limitations and problems in practice: it often overemphasizes the value of evidence from clinical trials alone, with some unintended consequences, such as taxpayer dollars spent on 'more research' of questionable value. The idea of SBM is not to compete with EBM, but a call to enhance it with a broader view: to answer the question 'what works?' we must give more importance to our cumulative scientific knowledge from all relevant disciplines."
The previous comments posted here mentioning what we can know about falling human bodies from first principles of how free fall in the earth's near gravitational field works, or by experiments with dummies including instruments, or by historical experience with aircraft disasters, correctly point out that randomized controlled trials should be designed with all the relevant science in mind. Further, clinical trials should be designed both to gain knowledge of the effectiveness of varying treatments and to minimize risk to human subjects of treatments (jumping out of an airplane without a parachute, for example, or using acupuncture, for another example) with no strong evidence of effectiveness.
See my all-time favorite link to share in HN comments: LISP hacker and rocket scientist (and now director of research at Google) Peter Norvig's article "Warning Signs in Experimental Design and Interpretation".
AFTER EDIT: In answer to the question, "Where is the evidence that EBM or SBM helps?", the evidence is all around you, all over the world. I've lived in more than one country, and I have had access to the research library of a major university health sciences center since I was in high school, and the incremental improvements in human lifespan at all ages, young and old, and the reduction in disease burden from all kinds of diseases are an outcome caused in large part by better understanding of how to prevent or treat disease, an understanding that comes about from science-based medicine.
An article about systematic reviews of acupuncture research in China illustrates the drawbacks of doing anything but science-based research on human disease prevention and treatment. People can imagine that a lot of speculative treatments work, but the way to demonstrate that one treatment or another works is to investigate it carefully with the underlying science in mind.
Peter Norvig may be an expert on scientific methodology, but when it comes to comedy he's not the first source I would turn to. Steve Martin, an authority in the field, said "Comedy is not pretty"[1] and that clearly overrides Norvig in the matter of this article.
George Carlin makes many excellent points on the subject of science, but most relevant here is when he raises the question of which party is really the subject of an experiment[2].
"The truth is, Pavlov's dog trained Pavlov to ring this bell just before the dog salivated." -George Carlin
Perhaps the real study here is to test how many people will take a joke a little too seriously just because it looks like a scientific paper.
You mean, where's the evidence that evidence-based medicine works?

I could produce some evidence, but I'm afraid that then you'd ask me where's the evidence that my evidence actually supports the hypothesis that evidence-based medicine actually helps.
Lots of old people. Though that might just be down to the decrease in witch burnings as a method of driving out plague. Hard to tell for sure unless we count them, but there do seem to be loads of them about.
If Monty Python were allowed a medical license....
"Contributors GCSS had the original idea. JPP tried to talk him out of it. JPP did the first literature search but GCSS lost it. GCSS drafted the manuscript but JPP deleted all the best jokes. GCSS is the guarantor, and JPP says it serves him right."
It's not really wrong. It makes the correct point that randomized controlled trials are only one kind of useful knowledge. It's possible to know stuff that can't be tested that way, so blind insistence on EBM is likely to throw out some babies along with all that bathwater.
Seth Roberts often makes similar points in his blog - people running experiments on themselves of size N=1 can often discover stuff that is useful, even vital, but can't be discovered or demonstrated in a big double-blind study.
Imagine that we had no scruples and were willing to throw people out of aircraft with dummy or real parachutes. The ones who had real parachutes would still notice when the parachute opened and slowed their fall. So we can't rule out the placebo effect! :-)