"all you have to do is read the methods section in the paper and follow the instructions."
I wish science were that simple. The methods section only contains the variables the authors thought were worth controlling, and in reality you never know which ones matter, and neither do the authors.
Secondly, I wish people would say "I replicated the methods and got a solid negative result" instead of "I can't replicate this experiment", because most of the time, when you are doing an experiment you've never done before, you just fuck it up.
Here is an example: we study memory using mice. Mice don't remember that well if they are anxious. Here are the variables we have to take care of to keep the mice happy, none of which will ever make it into the methods section:
Make sure the animal facility hasn't just cleaned their cages.
But make sure the cage is otherwise relatively clean.
Make sure they don't fight each other.
Make sure the (usually false) fire alarm hasn't sounded in the last 24 hours.
Make sure the guy installing a microscope upstairs has finished making noise.
Make sure there are no unrelated people talking/laughing loudly outside the behaviour space.
Make sure the finicky equipment works.
Make sure the animals love you.
The list can go on.
Because if any of these happens, the animals are anxious, they don't remember, and you get a negative result that has nothing to do with your experiment (although you may not notice this). That's why, if your lab is just starting to do something it hasn't done for years, you fail. And replicating other people's experiments is hard.
A positive control would help to make sure your negative result is real, but for some experiments a good positive control can be a luxury.
A lot of the commenters are trying to come up with technical solutions to the issue. It betrays their incomplete study of the 'softer' sciences and the difficulty of biological studies. Tech solutions won't work, bureaucratic solutions won't work, more data won't work.
Guys, the bio side of things is incredibly complicated, and trying to set up controls that are achievable within your time and budget is the heart of these fields. The thing you are trying to study on the bio side of science is actually alive and trying to study you right on back. If you are going to kill the thingys, they really do want to kill you too. Look at this diagram of mitochondrial phospholipid gene/protein interactions (https://www.frontiersin.org/files/Articles/128067/fgene-06-0...). That is very complicated, and that is for one of the best-studied organelles in the animal kingdom. There is an uncountable and evolving set of other proteins that then interact with that diagram in different ways depending on the cell type, species, and developmental history of the organelle (to start with). All of which are totally unknown to you and will likely forever remain unknown to you up until your death. Hell, we are still figuring out the shapes of organs in our own bodies. People who have studied areas for decades, spending untold millions of dollars on some of the most central questions of life, have essentially nothing to say for themselves and the money spent ('We dream because we get sleepy'). Trying to tease out a system that has been evolving for 4.5 billion years and makes a new generation (on average) every 20 minutes is going to be just insanely difficult.
This stuff is hard, the fields know it, and we don't believe anything that someone else says, let alone what our 'facts' and experiments tell us. But we soldier on because we love it and because we want to help the world.
I think there's also a lot of confusion about what the scientific literature should be.
Outsiders think of journals as pre-packaged nuggets of science fact, with conclusions that you can read and trust.
Scientists who publish in the journals view them as a way to communicate a body of work of "I did this, here is the result I saw. I think it might mean this."
The difference between these two views cannot be overstated. Each group wants the literature to be useful to them, so that they can use it, and that's understandable.
For most areas of science, especially in the early days of that science, it's absolutely essential that the scientists' view be allowed to persist. Is it better to share early, or to wait to publish until you've tried all the possible things that could go wrong? I think the answer is obviously that you share the early data, and what you think it means, even if you may be wrong about it.
If the goal is to advance knowledge as quickly as possible, I think the scientists' view is probably a better model of what a journal needs to be than the outsiders' view. In some fields, like fMRI studies, the field is realizing that they may need to go about things differently. And that means that a lot of the interpretations that were published earlier are incorrect. But that process of moving from incorrect interpretation to corrected interpretation is an essential part of science.
Scientists and academic institutions add to the confusion with the way they promote their work. Nobody puts language like "I did this, here is the result I saw. I think it might mean this" in a press release. It's always "Breakthrough research! Is the end of cancer in sight??"
A lot of this is because of the incentives that result from grant-based funding of science. Your ability to continue to do science is contingent on your ability to find someone to pay for it; it's much easier to convince someone to pay for it if you can point to headlines that say "Breakthrough research! Is the end of cancer in sight?"
The old system of "gentleman science" (where only independently-wealthy heirs did science, as a hobby) had a lot of problems and wouldn't really be workable today, but one thing it had going for it was that the scientist could count on the funding being there tomorrow, and so had much less of an incentive to overstate their work.
Absolutely true, though I know of very few scientists that have been able to tamp down the game of telephone that the university's press office starts.
Probably the real problem is the question of what "the scientific literature should be". It's not the right question to discuss today, because the scientific literature is just a rough approximation of scientific knowledge, and that is not what we need. The whole modern scientific system is outdated and has to be upgraded: journals, citation indexes, and degrees should go away and be replaced. Science should become more formal and more digital, so that it is easier to find and validate results. There should be databases, search engines, and digital signatures for the raw data and chains of proof. Counting citations should be replaced with counting the chains of proof produced by a scientist and included in others' works. The reliability of a proof, and its contribution to ratings, should be based on independent confirmations, which should be counted too - replication of new results should be considered an achievement in its own right. Journals may remain, but not as the primary way to exchange scientific information - more likely, they should become portals to the data, linking results in databases to their human-readable interpretations.
Respectfully, I think you're missing the complications outlined in the rest of the thread. The difficulty in the biosciences and even in chemistry and some physics is that they can't really be formalized in the way that is necessary to construct these "chains of proof".
The scientific literature is an ongoing conversation anchored by rigorous experimental facts and data. But rigorous doesn't mean it's clean like a mathematical proof. In fact, most science approaches its "proofs" in quite a different way than math. For example, as far as science would be concerned P!=NP in complexity theory. We've done the experiment many times, tried different things, it's pretty much true. But it's still not mathematically proven because there isn't a formal proof.
That's not to say it's invalid to expect more rigor, or that we wouldn't all love to have "chains of proof" and databases and signatures for data etc. It's that it's simply not practical given how noisy and complex biological systems are. In contrast to math, you pretty much never know the full complement of objects/chemicals/parameters in your experimental space. You try to do the right controls to eliminate the confounding variables, but you're still never fully in control of all the knobs and switches in your system. That's why you usually need multiple different experiments tackling a problem from multiple different approaches for a result to be convincing.
Formalized systems would be great, but I don't think we're even close to understanding how to properly formalize all of those difficulties and variables in a useful way. And it may not even be possible.
Whilst the experimental subjects and data collection are inherently fraught with difficulty, there's still a LOT of low-hanging fruit regarding things like automation. Many scientists use computers to write up results, to store data and perform calculations, but there's often a lot of manual, undocumented work which could easily be scripted to help those re-running the experiment. For example, running some program to produce a figure, without documenting what options were used; providing a CSV of data, without the formulas used for the aggregate statistics; relying on a human to know that the data for "fig1.png" comes from "out.old-Restored_Backup_2015~"; etc.
Such scripting is a step on the path to formalising methods. They'd help those who just want to see the same results; those who want to perform the same analysis using some different data; those who want to investigate the methods used, looking for errors or maybe doing a survey of the use of statistics in the field; those who want a baseline from which to more thoroughly reproduce the experiment/effect; etc.
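As a toy illustration of what that scripting can look like, here is a minimal sketch in Python; the file names, column names, and statistics are hypothetical placeholders, but the point is that every option used to produce the figure lives in a re-runnable script rather than in someone's memory:

    #!/usr/bin/env python3
    # Toy regeneration script: every option used to produce fig1.png is recorded here.
    # All file and column names below are hypothetical placeholders.
    import pandas as pd
    import matplotlib.pyplot as plt

    RAW_CSV = "data/trial_measurements.csv"   # raw data, archived alongside this script
    OUT_PNG = "fig1.png"

    df = pd.read_csv(RAW_CSV)

    # Aggregate statistics reported in the text, computed here rather than in a spreadsheet.
    summary = df.groupby("treatment")["latency_s"].agg(["mean", "std", "count"])
    summary.to_csv("fig1_summary.csv")

    ax = summary["mean"].plot(kind="bar", yerr=summary["std"], capsize=3)
    ax.set_ylabel("Latency (s)")
    plt.tight_layout()
    plt.savefig(OUT_PNG, dpi=300)
    print(f"Wrote {OUT_PNG} and fig1_summary.csv from {RAW_CSV}")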
The parent's list of mouse-frighteners reminds me of the push for checklists in surgery, to prevent things like equipment being left inside patients. Whilst such lists are too verbose for a methods section (it would suffice to say e.g. "Care was taken to ensure the animals were relaxed."), there's no reason the analysis scripts can't prompt the user for such ad hoc conditions, e.g. "Measurements should be taken from relaxed animals. Did any alarms sound in the previous 24 hours? y/N/?", "Were the enclosures relatively clean? Y/n/?", "Were the enclosures cleaned out in the previous 24 hours? y/N/?", etc. with output messages like "Warning: Your answers indicate that the animals may not have been relaxed during measurements. If the following results aren't satisfactory, consider ..." or "Based on your answers, the animals appear to be in a relaxed state. If you discover this was not the case, we would appreciate if you update the file 'checklist.json' and send your changes to 'experimentABC@some-curator.org'. More detailed instructions can be found in the file 'CONTRIBUTING.txt'"
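A minimal sketch of what such a prompt could look like (the questions, defaults, and file names below are just the ones from the example above, not any existing standard):

    #!/usr/bin/env python3
    # Sketch of a pre-analysis checklist prompt; answers are stored in checklist.json.
    import json

    # (key, prompt, answer that indicates a relaxed animal)
    QUESTIONS = [
        ("alarms_24h", "Did any alarms sound in the previous 24 hours? y/N/? ", "n"),
        ("cages_clean", "Were the enclosures relatively clean? Y/n/? ", "y"),
        ("cages_changed_24h", "Were the enclosures cleaned out in the previous 24 hours? y/N/? ", "n"),
    ]

    answers = {}
    for key, prompt, relaxed in QUESTIONS:
        raw = input(prompt).strip().lower() or relaxed
        answers[key] = {"y": True, "n": False}.get(raw)  # None means "don't know"

    with open("checklist.json", "w") as f:
        json.dump(answers, f, indent=2)

    stressed = any(answers[k] is not None and answers[k] != (relaxed == "y")
                   for k, _, relaxed in QUESTIONS)
    if stressed:
        print("Warning: your answers indicate the animals may not have been relaxed "
              "during measurements. If the following results aren't satisfactory, consider that first.")
    else:
        print("Answers recorded in checklist.json; see CONTRIBUTING.txt if you later find a correction.")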
I like the idea, but who the hell is ever going to go through all of that? Yes, you made some checklist, great. But no other lab is going to go through all of that. And in your field, if you are very lucky, you may have just 1 other lab doing anything like what you are doing. It would be a checklist just for yourself/your lab, so why bother recording any of it? Yes, do it, fine, but how long should you store those records that will never be seen, even by yourself? Why in god's name would you waste those hours/days just going over recordings of you watching a mouse/cell/thingy to make sure some uncountable number of little things did/did not happen? If you need that level of detail, then you designed your experiment wrong and the results are just going to be swamped in noise anyway. You are trying, then, to fish significant results out of your data, the exact wrong way to run an experiment. Just design a better trial; there is no need to generate even more confusing data that has a 1/20 chance of being significant.
The checklist is not required to be at that level of detail. It just has to exist and be generic enough. It's interesting to see the fire alarm example here: to me, the existence of such factors points to potential improvements in the experimental environment. Why not exclude ALL stress factors by designing something like a sound-proof cage? Needs extra budget? Probably, but what about some other unaccounted-for noise that will ruin the experiment? This suggests a better checklist item: ensure that the experiment provides a stress-free environment by eliminating sound, vibration, smells, etc.
> who the hell is ever going to go through all of that?
It's not particularly onerous, considering the sorts of things many scientists already go through, e.g. regarding contamination, safety, reducing error, etc.
> Yes, you made some checklist, great. But no other lab is going to go through all of that. And in your field, if you are very lucky, you may have just 1 other lab doing anything like what you are doing. It would be a checklist just for yourself/lab, so why bother recording any of it?
Why bother writing any methods section? Why bother writing in lab books? I wasn't suggesting "do all of these things"; rather "these are factors which could influence the result; try controlling them if possible".
> Yes, do it, fine, but how long should you store those records that will never be seen, even by yourself?
They would be part of the published scientific record, with a DOI cited by the subsequent papers; presumably stored in the same archive as the data, and hence subject to the same storage practices. That's assuming your data is already being published to repositories for long-term archive; if not, that's a more glaring problem to fix first, not least because some funding agencies are starting to require it.
> Why in god's name would you waste those hours/days just going over recordings of you watching a mouse/cell/thingy to make sure of some uncountable number of little things did/did not happen?
I don't know what you mean by this. A checklist is something to follow as you're performing the steps. If it's being filled in afterwards, there should be a "don't know" option (which I indicated with "?") for when the answers aren't to hand.
I imagine it would be easy to have a git-like storage system for this information, where reproduction experiments would be a branch without the actual measurement data.
> a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
While this is interesting to speculate about, perhaps it would be best to start with something like the machine learning literature, where everything is already run computationally, and those in the field have the skills to easily scratch their own itch to improve the system so that it works for them.
Even in machine learning, how difficult would it be to get that field to adopt a unified experiment running system? It sounds like a huge engineering project that would have to adapt to all sorts of computational systems. All sorts of batch systems, all sorts of hadoop or hadoop like systems. And that's going to be far easier than handling wet lab stuff.
I think that the lack of something like this in ML shows that there's enough overhead that it would impede day-to-day working conditions. Or maybe it just hasn't been invented yet in the right form. There are loads and loads of workflow systems for batch computation, but I've never encountered one that I like.
In genomics, one of the more popular tools for that is called Galaxy. But even here, I would argue that the ML community is much better situated to develop and enforce use of such a system than genomics.
I agree that computational fields are more well-suited to spearhead such approaches, but I don't think machine learning is a good example. ML researchers are constantly pushing at the frontiers of what our current technology can do; consider that a big factor in neural networks coming back into fashion was the ability to throw GPUs at them. The choice of hardware can make a huge difference in outcomes, and some researchers are even using their own hardware (the work being done on half-precision floats comes to mind); any slight overhead will get amplified due to the massive amount of work to be computed; and so on.
Maybe a field that's less dependent on resources would be a better fit. An example I'm familiar with is work on programming languages: typechecking a new logic on some tricky examples is something that should work on basically any machine; benchmarking a compiler optimisation may be trickier to reproduce in a portable way, but as long as it's spitting out comparison charts it doesn't really matter if the speedups differ across different hardware architectures.
When the use of computers is purely an administrative thing, e.g. filling out spreadsheets, drawing figures and rendering LaTeX (e.g. for some medical study), there's no compelling reason to avoid scripting the whole thing and keeping it in git.
I thought that statistics was the right tool to handle uncertainty? I had the impression that all the "soft" sciences are based on statistics. We can't prove that smoking kills people. But, given a certain confidence interval (or whatever the equivalent measure is for Bayesian statistics?), we can state that smoking is not a good idea if you want to live longer than the mean expected years...
Sure, some unforeseen event may interfere with your experiment. But statistics should account for that, shouldn't it? Just wondering..
Statistics is also the way to look at data in the "hard" sciences. Every measurement has error, and that's how you deal with it.
However, it's not a simple thing that automatically combines different studies. It takes skilled application to understand how data connects, what's comparable, what's not, etc. Traditionally, "meta-analysis" is the sub-field of statistics that combines studies. But that only combines extremely simple studies, such as highly controlled clinical trials. It's inappropriate for the complex type of data that appears in a typical molecular biology paper, which is a chain of lots of different types of experimental setups.
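For what it's worth, the mechanics of a classic fixed-effect (inverse-variance) meta-analysis fit in a few lines; it's deciding which studies are comparable that takes the skill. A toy sketch with made-up effect sizes:

    import math

    # (effect estimate, standard error) for three hypothetical studies
    studies = [(0.42, 0.15), (0.10, 0.20), (0.35, 0.12)]

    weights = [1.0 / se ** 2 for _, se in studies]   # inverse-variance weights
    pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))

    print(f"pooled effect = {pooled:.3f} +/- {1.96 * pooled_se:.3f} (95% CI)")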
Those who don't know the body of stats trying to reason about its application are a lot like an MBA trying to reason about software architecture. The devil is in the details, and the details are absolutely 100% important in the application of statistics to data.
Amen. You can google the controversy around the StatCheck program to really dive into why stats and their applications are beyond lies and damned lies in their falsehoods (because you can prove the lie is right). Something as simple as 'smoking causes cancer' is a very, very simple experiment to interpret (hard to fund/perform, though). It's the pernicious little studies on 8 cells total that get messy. Performing a lot of experiments is hard.
I knew of a student who graduated with only 8 cells of data out of his 7 years in grad school. That may sound like a small amount, but to his committee, it was a very impressive number. He (very, very) basically sliced up adult rodent brains and then used super tiny glass pipettes to poke the insides of certain cells. He chemically altered these cells' insides in a hopefully intact network of neurons, shocked the cells, and then recorded the activity of other cells in the network using the same techniques. Then he preserved and stained the little brain slice so he could confirm his results anatomically. From start to finish, each attempt took 13 hours, no lunch or restroom breaks, every day, for 7 years. He got 8 confirmable cells' worth of recordings total.
That is a hard experiment. But due to his efforts in adding evidence, we now suspect that most adult hearing loss is not due to loss of cells in the ear, but to the coordination of signals to the brain and their timing mismatch. It is not much, but it adds to the evidence and will for sure help people someday.
To add, he is now a beer brewer in Bavaria and quit science. This shit takes sacrifice man.
> I had the impression that all the "soft" science is based on statistics.
In medicine, case studies are often used for low n issues. There are too many variables for meaningful statistics to be pulled out, but a "this patient had X, we did Y, Z happened" is still a way to pass on observational information. It's recognised that case studies aren't ideal, but it's still better than not passing on information at all.
Everything can be formalized, even uncertainties in knowledge about the experimental environment (e.g. by making a statement that unaccounted-for parameters do not influence the result - indeed, this will be challenged, proof will be requested, and things like a fire alarm affecting mouse behavior will have to be reliably excluded). Something that cannot be formalized cannot be proven or falsified and thus is not science. The only difficulty may be building the necessary apparatus, but that's doable, and that's the way to fix science.
Umm, no. Look at Gödel's Incompleteness Theorem. You can prove that you will always be able to construct a paradox in any formalized system of logic; at least, under our current understanding of logic. Expanding on that (Gödel, Escher, Bach by Hofstadter goes into it well), you can then say that any theory of the universe must have holes in it, and any machine or system that attempts to formalize the observations will always come up with paradoxes. You are right, you can formalize everything (maybe, the jury is still out on that, but I think so), but at the risk of then creating paradoxes in the system.
I'm aware of this theorem, and it has nothing to do with the scientific method; it only gives us an idea of the possible results of research, and it does not tell us that you cannot formalize the life sciences or chemistry. Indeed, there are theories that cannot be proven, but they are themselves subject to formalization and research.
> Guys, the bio side of things is incredibly complicated
I always find it amusing when physicists talk about how mind-blowing it is that at the quantum level, things aren't entirely predictable. Over in biology, that's the starting point for everything rather than the final frontier - you don't need ridiculously expensive tools to get to the point where you're finding unpredictable stuff.
(trained as a chemist, grad school in biostats & genetics, now mostly design experiments & clinical trials... I have physics envy, except I don't envy their funding models!)
The double slit experiment is purely random, not unknowable. There is a large difference. In Bio, we will some day conceivably know enough to get to the point where we can build up stat models like the double slit experiment. I'd say that the double slit experiment is still leaps and bounds a better starting point than anything we have in bio and only took ~2 decades to finally parse out. Bio has been chugging along since Watson for ~8 decades before we finally got CRISPR and could really do anything about DNA.
Cre-Lox systems predate CRISPR/Cas9 or CRISPR/Cpf1 approaches by quite some time. If one exists for the system you want to study, they also tend to work better. (Floxing the original mouse is, however, substantially harder.)
I don't think it's entirely accurate to say that conditional editing of DNA is a new thing. The ready accessibility and combinatorial possibilities, yes, but for targeted conditional knockouts, floxing mice has been a thing for about 20 years now.
My PhD work is trying to address this by developing better ways for scientists to record and communicate their methods/protocols. Methods sections are NOT about providing the information needed to reproduce the work; often even the supplemental information is insufficient. They are a best guess at what needs to be done, and the methods section is often massively condensed due to editor demands and cites other papers that also don't contain the relevant information because some aspect of the method changed between papers. (I could go on and on about this.)
I spent 7 years doing experimental biology (from bacteria to monkeys), and trying to replicate someone else's techniques from their papers was always a complete nightmare. Every experimentalist I talk to about this relates the same experience -- sometimes unprompted. Senior faculty tell a slightly different story, that they can't interpret the data of someone who has left the lab, but it is the same issue. We must address this; we have no choice. We cannot continue as we have for the past 70+ years; the apprenticeship system does not scale for producing communicable/replicable results (though it is still the best way to train someone to actually do something).
EDIT: An addendum. This stuff is hard even when you assume that all science is done in good faith. That said, malicious or fraudulent behaviour is much harder to hide when you have symbolic documentation/specification of what you claim you are doing, especially if they are integrated with data acquisition systems that sign their outputs. Still many ways around this, but post hoc tampering is hard if you publish a stream of git commit hashes publicly.
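As a rough sketch of that last point, assuming nothing beyond the Python standard library (the file names are hypothetical, and a real acquisition system would sign outputs rather than merely hash them):

    import hashlib
    import datetime
    import pathlib

    def log_acquisition(data_path: str, log_path: str = "acquisition_log.txt") -> str:
        """Append the SHA-256 of a freshly acquired data file to an append-only log."""
        digest = hashlib.sha256(pathlib.Path(data_path).read_bytes()).hexdigest()
        line = f"{datetime.datetime.utcnow().isoformat()}  {digest}  {data_path}\n"
        with open(log_path, "a") as log:
            log.write(line)
        return digest

    # e.g. log_acquisition("recordings/session_042.raw")
    # Committing the log (and publishing the commit hashes) makes post hoc tampering
    # with the raw files detectable, even if it can't prevent it.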
Wouldn't it be hard to achieve deep consistency between experiments, in so many labs around the world, with such different conditions/cultures/etc., and when the experimenters aren't experts in consistency, but in their science?
Wouldn't it be better to use something like a cloud biology model, where you define experiments via code and CROs compete on consistency (and efficiency and automation)? Since they probably do a much larger volume of experiments than a regular lab, they would have stronger incentives to develop better processes and technologies.
I work at a cloud bio lab. We run all of our experiments on automation, and all protocols must be defined in code. The latter is both the power and the difficulty -- when your protocol is defined in code it is explicit. However, writing code is both new and sometimes difficult for the scientists that we currently work with (molecular biology, drug discovery). I believe what we are doing is the right model. But it comes with this overhead of transitioning assays to code, so there is that against it. This is mostly just a matter of time, though. Another nice thing about code is that you can't tweak it once it's running. You can define your execution and analysis up front to guard against playing with results down the road. Now that being said, there still needs to be a significant change to how research is funded and viewed by the public, because pure tech solutions can't solve everything. Our tech can't decide what you pick to research. It can't dish out grants to the truly important research. So it will take many angles to really solve any portion of this problem.
I do agree, it seems like right model, and will have a large impact.
Between automating labor, economies of scale in purchasing, access to more efficient technology (like acoustic liquid handling), etc., isn't it just a matter of time before cloud biology becomes quite cost-effective and, combined with the other benefits, the only way that makes sense to do research, so funding will naturally go there?
Also, do you see a way to add the extreme versatility of the biology lab into a cloud service?
> it would be the only way that makes sense to do research
There will certainly be more than just one way, although I hope cloud labs are the front runner. Also, cloud and automated are two separate concepts. We do both, but there's no reason that you can't just do one or the other. The automation is critical for reproducibility for many reasons. But I think the cloud aspect is mostly helpful from a business perspective -- it makes it easier on everyone to get up and running on our system. But there are many in-lab automation solutions that are helping fight the reproducibility crisis. And on the flip side, there are cloud labs that aren't automated.
> do you see a way to add the extreme versatility of the biology lab into a cloud service
We let you run any assay that can be executed on the set of devices that we have in our automated lab. So in that sense, yes, it's very flexible. Also, there's no need to run your entire workflow in the cloud. You can do some at home, some in the cloud. Some people even string together multiple cloud services into a workflow. See https://www.youtube.com/watch?v=bIQ-fi3KoDg&t=1682s
That being said, biology labs can be crazy places. Part of what we do is put constraints on what can be encoded in each protocol to reduce the number of hidden variables. Every parameter that counts must be encoded in the protocol, because once you hit "go" on the protocol, it could run on any number of different devices each time it runs. The only constant is that the exact instructions specified in the protocol will be run on the correct device set.
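For readers who haven't seen protocols-as-code, here is a purely hypothetical sketch of the flavor (invented for illustration, not the API of any actual cloud lab): the point is that every volume, temperature, and duration is fixed before anything runs.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Step:
        action: str
        params: dict

    @dataclass
    class Protocol:
        name: str
        steps: List[Step] = field(default_factory=list)

        def transfer(self, src: str, dst: str, volume_ul: float) -> None:
            self.steps.append(Step("transfer", {"src": src, "dst": dst, "volume_ul": volume_ul}))

        def incubate(self, sample: str, temp_c: float, minutes: int) -> None:
            self.steps.append(Step("incubate", {"sample": sample, "temp_c": temp_c, "minutes": minutes}))

    p = Protocol("lysis_test_v3")                        # hypothetical assay name
    p.transfer("plate1/A1", "plate2/A1", volume_ul=50)   # every volume is explicit
    p.incubate("plate2/A1", temp_c=37.0, minutes=30)     # no "about half an hour"
    print(p.steps)  # the serialized steps are what actually run, on whichever device set is free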
1. Yes, but the idea would be that if you provide a way to communicate the variables that actually matter for consistency then you can increase the robustness of a finding. If you have one lab that can _always_ produce a result, but no one else can, then clearly we do not really understand what is going on and one might not even be willing to call the result scientific.
2. Maybe not better, but certainly more result oriented. Core facilities do exist right now for things like viral vectors and microscopy (often because you do need levels of technical expertise that are simply not affordable in single labs). If there were a way to communicate how to do experiments more formally then the core facilities could expand to cover a much wider array of experiment types. You still have to worry about robustness, but if you have multiple 'core' facilities that can execute any experiment then that issue goes away as well. The hope of course is that individual labs as they exist today (perhaps with an additional computational tint) would be able to actually replicate each other's result, because we will probably end up needing nearly as many 'core' facilities as we have labs right now, simply because the diversity of phenomena that we need to study in biology is so high.
There are already approaches in this direction, e.g. providing a standardized experimental hardware-software interface with Antha [1]. The complexity of the problems in question (biological domain, biophysical, biochemical) is daunting - we do not understand many things, "there is plenty of room at the bottom".
How about recording videos of the lab? People trying to reproduce the experiment can just sift through the video. That may be tedious but it's far better than nothing.
Just a little metadata would help: Experiment A, Phase N, Day X
Video and photographic evidence can play a big role when we have the extra bandwidth to process such a dataset. Right now we barely have time to do the experiments, much less 'watch tape' to see how we did (maybe if scientists were paid like professional sports players...). In an ideal world we would be collecting as much data as we possibly could about the whole state of the universe surrounding the 'controlled' experiment. That said, video and photographs are very bad at communicating important parameters in an efficient way. Think about how hard it is to get information out of a youtube video if you need something like a part number. Photos do better, but if you need to copy and paste out of a photo we will need a bit more heavy lifting to translate that into some actionable format (eg ASCII).
I didn't mean record it to use the data in your analysis, but record it to preserve the methods for others. If they have trouble getting part of the experiment to work, they can pull up the video and see how you did it (at least to a degree; I'm not expecting 360° video). You can't possibly record in text everything a video could capture.
Thanks for sharing your knowledge and experience in this discussion, by the way. It's what makes HN great.
Ah, yes, things like JoVE [0] are definitely useful, but they don't seem to scale to the sheer number of protocols that need to be documented (eg a single JoVE publication is exceedingly expensive). I have also heard from people who have tried to record video of themselves doing a protocol that it is very hard to make it understandable for someone else. That said, if the 'viewer' is highly motivated, videos of any quality could be invaluable. Sometimes it is just better to buy the plane tickets and go directly to the lab of the person who can teach you (if they are still around).
> if the 'viewer' is highly motivated videos of any quality could be invaluable
That's what I meant. Just stick some cameras in the ceiling (or wherever is best) and capture what you can. It seems cheap and better than nothing, but I know nothing about biological research.
True, biology is a horror show full of surprises. Many experimental instructions are as reliable as astrologic forecasts. I guess that is the price of complexity and the human factor.
I would love to hear more about your work, and the strategies you propose to improve the reproducibility of scientific experiments. My email can be found in my user description.
Have you looked at Common Workflow Language (CWL), see below?
Friend of mine has been experimenting with wrapping it all up in Docker containers! :-)
> a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
I was chatting with a colleague last summer, asking him for more details about a method he had published that I was having trouble reproducing. After pointing me to a few papers that didn't include the relevant details, he finally told me that "if we told everyone about how it was done, anyone could do it."
While at the time I was pretty upset with him, perhaps it's the competitive nature of science and funding that also gives people a mild incentive to be secretive.
In this case, simply publishing code would have resolved the questions.
> I was chatting with a colleague last summer, and asking
> him for more details about a method he had published that
> I was having trouble reproducing. After pointing me to a
> few papers that didn't include the relevant details,
> he finally told me that "if we told everyone about how it
> was done, anyone could do it."
That's not science, that's bullshit. Can you please expose that? Scientists shouldn't simply get away with such malicious behavior.
That is exactly how science works. Methods sections include just enough information to get a good sense of how the experiment was done, not enough information to replicate it exactly. Some disciplines are worse than others.
People could also publish plain text data as supplementary material, but why do that when you can get away with a raster image of a plot...
That's not how science works. During my scientific career I gave code and data away to any and all who wanted them. That includes critics, competitors, and anyone else who was curious.
I felt my results were solid enough (and that my skill at producing more results was good enough) that this wouldn't hurt me. This was just how things worked in my field (physics).
Interestingly, my experimental colleagues rarely felt that they couldn't reproduce results they saw in a paper.
The way you get funded is by being able to do things that others can't do, or by being better at some technique than everyone else. Giving away all your hard won tricks by putting all of them in a methods section takes away your advantage come funding time.
The problem is that it is far easier and cheaper to produce stuff that looks like science but isn't, so if you fund stuff without being able to prove it/reproduce it then you will probably end up funding 90% bullshit while starving real science for money.
Any real scientists ought to recognize that secrecy as a strategy is terrible for science as a whole even if it is very temporarily good for them.
Though it should be noted that data sharing and dissemination plans are now required for many funding types, and I've been on at least two applications recently with very strict "You will share with others" requirements in them.
What's the point of funding something that won't help anyone because the author won't tell you how they did it? How is any conclusion that they've come to useful if nobody can verify if it's correct?
This can definitely be the problem. I've seen studies where 50% of funding came from the university and 50% from a private company. The university required the study to be published in their database; the company wants the end result and doesn't want the media to know what they are working with. So the report is obfuscated just enough to be eligible for publication without giving away too much data from the company.
Especially for final master's degree projects this is very common, as the students don't get paid by the university at all, so many try to find a company to sponsor them. But the students still need the uni to publish the report for them to get their final degree, so you get this conflict of interest again. Most of these reports are written with the end goal of getting a degree, not of creating solid research. This really needs stricter universities that don't let all that crap through; for now, these reports shouldn't be trusted the same way as proper research papers.
It is more subtle than you think. It is not that you give no information about how to do things; you lay out all the steps that you took in your methods section. An example of keeping all the tricks to yourself is that you do not tell others about the 100 small things you found out, the hard way, that you should avoid doing. I.e., you explicitly say in the methods section these are the steps I took, but what others really need to know is why you ended up doing all the tiny, tiny things in the particular way that you did. In many cases, you could write pages and pages about all these reasons. All these little tricks add up to much greater efficiency. Good experimentalists are the ones that have already made all the mistakes.
Edit: This is additional context for the commenters below.
I remember my teachers in high school making such a big deal about the scientific method: how important it was, how experiments must be reproducible to be useful. But today you barely hear it mentioned one way or another.
It's no more a "comforting lie" than your driving instructor telling you to check your blind spot. Not everyone does it, and bad things happen as a result, but all the more important to teach it.
I'm actually surprised to hear of high school teaching good scientific practice. I don't remember ever being taught that. Widely may it spread.
Hold on -- there's a category confusion here: "check your blindspots" isn't a comforting lie; it's a command. Converting it to "checking your blindspots will avoid collisions with careless drivers" would make it no longer a lie.
I was pointing out the category confusion. Science teachers don't purport to tell you what scientists actually do (history or sociology teachers might, I suppose, but they don't usually "make such a big deal" out of the scientific method). They purport to teach science.
As you say - it's not a comforting lie, it's a command.
Lots of insights/technologies you use today were produced by scientists working in such and even more secretive conditions, even with public funding involved (e.g. nobody at the Manhattan project would broadcast their discoveries to the world and same for many other fields, not necessarily war related).
That's an absurd degree of cynicism. I just finished my PhD in chemical engineering, and I did the reverse of this. In fact, I successfully migrated the younger members of my research group to a completely open-source software stack for chemical simulations, so that in principle anyone with a Linux box could reproduce our results. We published all our code and plain text versions of the data.
I'm not saying you're wrong about the incentives - scientists are often incentivized toward secrecy - but I deny that we have to follow such incentives.
It differs depending on the field. Biology has a reputation for being extremely competitive. I've heard of PIs telling their students what they're forbidden to reveal at conferences for fear of getting scooped. As a CS grad student it was a completely different story: I was delighted just to get someone who was willing to listen to me talk.
I agree. I was on the team at Texas A&M that cloned the domestic cat (circa 2001), and secrecy was very important (not least because of the very real physical threats we had from anti-cloning nutcases). Even the lab location was secret. The lab itself wasn't hidden, but nobody but us knew what was going on there. It was simply called "Reproductive Sciences." There were some very high incentives to be "first" -- which we were. Unfortunately for me, as an undergraduate research fellow, my name didn't make it into the Nature paper.. but wow what an experience!
I am writing a grant proposal right now. The success rate is ~19%. Personnel (i.e. publication record) of the team is weighted at 40% of the proposal. If you don't have an H-index of >50, there is no incentive to share with others that which gives you an advantage in the science section.
Code/design/engineering are considerably different from, say, biology, because you can (and should) publish artefacts. The actual code/algorithms (not pseudo code) need to be shared and archived with the journal.
There has been some work on archiving code with some journals, and some allow video uploads as well for segments from the actual experiments.
> Maybe the format needs to change. Perhaps journals should require video, audio commentary or automated note taking for publication.
A 'world view' column in Nature suggested the same things last week [1]; the author described a paper of theirs [2]:
> Yes, visual evidence can be faked, but a few simple safeguards should be enough to prevent that. Take a typical experiment in my field: using a tank of flowing water to expose fish to environmental perturbations and looking for shifts in behaviour. It is trivial to set up a camera, and equally simple to begin each recorded exposure with a note that details, for example, the trial number and treatment history of the organism. (Think of how film directors use clapper boards to keep records of the sequence of numerous takes.) This simple measure would make it much more difficult to fabricate data and ‘assign’ animals to desired treatment groups after the results are known.
I thought some journals now allow video archives. Some of the most famous experiments, such as the Stanley Milgram studies, have excellent video documentation (and in the case of Milgram, it's been replicated all around the world... although no, it's not ethical to do so).
Most experiments run for years (literally) and no one is going to record or archive, let alone watch, years of footage to confirm that one paper is legit.
A brief experiment showing the apparatus and the collection of a few data points might be helpful for understanding the paper, but I can't see using it to verify a non-trivial experiment.
For those that study human subjects, releasing videos of the subjects is not going to happen any time soon. Participants have rights, and anonymity is an important one.
Blur their faces? Some experiments might depend on seeing faces, but not all. Plus, you would at least have video of everything the experimenters do, if not the results.
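Technically, blurring faces is not hard; a rough sketch using OpenCV's bundled Haar cascade (the file names are hypothetical, and a real study would want something more robust than a frontal-face detector, which misses faces at odd angles):

    import cv2

    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def blur_faces(frame):
        """Blur every detected face region in a single video frame."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(frame[y:y + h, x:x + w], (51, 51), 0)
        return frame

    cap = cv2.VideoCapture("session.mp4")   # hypothetical lab recording
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if writer is None:
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter("session_blurred.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 30, (w, h))
        writer.write(blur_faces(frame))
    cap.release()
    if writer is not None:
        writer.release()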
The journal I work for just rolled out a new methods format; the big change is requiring all resources to be listed, with their origin (a major problem is that Chemical A or lab rat subspecies B might be very different from one company to another, even if they're theoretically identical). What's really needed is broader standardization of all practices; requiring video wouldn't solve the issue that different labs might not know what another would think needed to be videotaped. No one is going to document and record the entire course of an experiment; that would literally be months of footage. We'd like to make sure everyone has the same understanding of procedures and techniques, but that requires communication not just through journals, but between scientists and academic institutions.
Additionally, not everything appears on video. We have found major differences between identical studies run in rooms set at different temperatures. Also, many of the procedures I do would essentially require a camera operator to capture all of the movement. I take a cage out of a rack, take a mouse out of the cage, weigh it, dose it, etc.
No, but the burden of videography is a lot higher than the burden of writing a paper. In fact, a paper isn't written for every study, so in order to write a paper, you'd have to take video of every study just in case one of them is used in a paper in the future. It's a much higher burden than people want to believe. Add to that the fact that most animal facilities won't add cameras unless you force them at gunpoint because historically, videos of that sort make for targeting by protest groups.
Or require the author to only communicate to the technicians via the same means that will be published with the paper -- i.e. Make sure anyone reading the paper has the same spec for the experiment as those who performed the original. [1]
That would help prevent such "hidden specs" from entering the experiment.
[1] Note: this implies that authors cannot be part of the experiment or conduct it themselves, because they can't "pass on their identity" in a research paper, as would be necessary to put readers on par with the technicians.
In my job, I work with an outside contract research organization. The way this works is pretty similar to what you're describing, and it's a living hell. For starters, it takes 6-10 drafts back and forth to get a protocol to start with, and the resulting document is never shorter than 12 pages. Then, once they start a study, if we get any aberrant data, the question becomes "did they mess that up, or is that data real?". Of course anything we want to be sure about, we run twice, but when you have a panel of 20 compounds, you can't duplicate all of the work, so some compounds get dropped even if the data were not "real". Also, and maybe this would be different with another CRO, there are often very stressful conversations in which they are trying to avoid being blamed (because if they screwed up, we aren't supposed to pay them the full price), but we just really want to know what happened. Lastly, you can tell a lot by being hands-on with a study; there's a lot you can miss if you aren't in the room with the study. Just my 2 cents.
Well, the burden of a policy has to be judged relative to what you're trying to accomplish. If the policy is ensuring that you achieve your ostensible goals, then that burden is justified.
Based on your description, it actually sounds like it's making you do things exactly the way science is supposed to work! You quickly identify issues of "real effect or experimenter error or undocumented protocol?" -- and you prevent any ambiguous case from "infecting" the literature.
Those are the same objectives modern science is currently failing at with its "publish but never replicate" incentives.
> Lastly, you can tell a lot by being hands-on with a study; there's a lot you can miss if you aren't in the room with the study.
I wasn't saying that you can't be there in the lab and do that kind of experimentation, just that scientists shouldn't represent this kind of ad hoc work as the repeatable part that merits a scientific conclusion. The process should be: if you find something interesting that way, see if you can identify a repeatable, articulable procedure by which others can see the same thing, and publish that.
The proposal has a problem in that it will increase the quality of results but will enormously slow down progress.
E.g., looking from the perspective of a scientist-reader: if someone has spent a few months doing ad hoc hands-on experiments and made interesting observations, then I would want them to publish that research now, instead of spending another half a year developing a repeatable procedure, or possibly never publishing it because they'd rather do something else than bring it up to these standards. There is a benefit to getting all the tricks that make the experiment cleaner, but the clean experiment is just a means for acquiring knowledge and ideas for further research, not an end goal by itself for a researcher in that area. Outsiders may have different interests (https://news.ycombinator.com/item?id=13716233 goes into detail), preferring "finalized and finished research", but the actual community (who in the end is doing everything and evaluating/judging their peers) would generally prefer work in progress to be published as-is, instead of having more thorough results, but later and fewer of them.
Outsiders can get the repeatable procedures when it's literally textbook knowledge, packaged in assignments that can be given to students for lab exercises (since that's the level that truly repeatable procedures would require). Insiders want the bleeding-edge results now, so they design their journals and conferences to exchange that.
Then maybe the problem is that the public is expecting results to be actually true, exacerbated by "hey what peer-reviewed literature are you supporting your argument with". If peer-reviewed literature is just a scratchpad for ad hoc ideas that may turn into something legit later, then the current standard is good enough, and we shouldn't be worrying that most of them are false.
OTOH, it's a problem if people are basing real-world decisions on stuff that hasn't reached textbook level certainty. That's pretty much what happened with dietary advice and sugars. "Two scratchpads say a high-carb low-fat diet is good? Okay, then plaster it all over the public schools."
I think this is the case, and it has been mentioned elsewhere in this thread. When I see a paper published that I am interested in, I have to fit it into the context of what I already know about a field, the standards of the journal it's published in, sometimes the origin of the paper (some labs are much less sloppy than others), and other factors.
For a recent personal example, a company published a paper saying that if you pre-treat with 2 doses of a particular drug, you can avoid some genetic markers of inflammation that are in the bloodstream and kidneys. Well, I looked at the stimulus and ordered some of my own from a different manufacturer that was easier to obtain, and gave it to some mice with and without pretreatment by their compound. Instead of looking at the genes they looked at, I looked at an uptick in a protein expected to be one step removed from the genes they showed a change in. Well, I haven't exactly replicated their study, but I've replicated the core points: stimulus with the same cytokine gives a response in a particular pathway, and it either is or isn't mitigated by the drug or class of drugs they showed. Now, my study took 2 days less than theirs, but it worked well enough that I don't need to fret the particular details I did differently from them. If my study didn't work, I could either decide that the study isn't important to me if it didn't work my way, or go back a step and try to match their exact reagents and methods.
So yes, I do think the news industry picks up stuff too quickly sometimes, but depending on the outlet, they tend to couch things in appropriate wiggle words (may show, might prove, could lead to, add evidence, etc).
Yes, the problem seems to be that the general public is expecting journals/conferences, which are essentially implemented by a research community as a tool for their ongoing research workflow, to fit the goal of informing the general public - but there reasonably would/should/must be a gap of something like a year (or many years) between the time a finding is initially published, so that others can work on that research, and the time when the finding has been reasonably verified by other teams (which necessarily happens a significant time after it's been published) and thus is ready to be used for informing public policy.
It's like the stages of clinical research - there we have general standards on when it is considered acceptable to use findings for actually treating people (e.g. phase 3 studies), but it's obviously clear that we need to publish and discuss the initial findings, since that's required to actually get to the phase 3 studies. However, the effects seen in phase 1 studies often won't generalize to something that actually works in clinical practice, so if the general public reads them, they'll often assume predictions that won't come true.
Usually, if you want the full details, you look/ask for the thesis or full report that the journal article is the condensed version of. Most journals also allow supplementary sections for more detailed methods. If it's too long, people simply won't read it - already most will just read the abstract and look at figures.
The 'unknown unknowns' will never be eliminated. Certainly it's worthwhile to try to improve methods communication, but there are limits.
You really have to try it yourself before you can understand the degree of troubleshooting that's required of a good experimentalist. You could have scientists live-streaming all their work, and you'd still have the same issues you do now. Even a simple experiment, something most wouldn't blink an eye at, has dozens and dozens of variables that could influence the result. The combinatoric complexity is staggering. The reality is that you try things, find some that work, and then convince yourself that it's a real result with some further work.
The methods that endure are the ones that replicate well and work robustly. Molecular biology is still built on Sanger sequencing, electrophoresis, and blotting, all in the context of good crossing and genetics, because that's what works. Some of the genomic tools are starting to get there, I'd venture to say that RNAseq is reasonably standardized and robust at this point. Interpreting genomic data is another story...
What you're talking about is the authors not reporting variables they considered, and then didn't control for. These not being reported is not universal - for example, I very frequently report every variable considered, and make my code available, with comments about variable selection within.
What the parent post is talking about is "unmeasured confounding", or a Rumsfeldian "Unknown Unknown". If there is something that matters for your estimate, but you're unaware it exists, by definition you can neither report it nor control for it.
Yeah, I did an experiment on metabolism in relation to temperature with rodents in college. Our results were just absolute garbage for the most part. I think two of us owned cats, the walk-in temperature-controlled fridge was incredibly loud and the light never turned off, and the fridge was on the opposite side of the building from the only lab with the correct equipment to run the experiment. Plus the enclosure for measuring VO2 was not very relaxing for the mice regardless.
Another group was doing a metabolism experiment with caffeine and rats. The only meaningful result they got was the half-life of caffeine. The rats were incredibly animated regardless of whether they had been dosed with caffeine or not, and they basically got garbage for results as well.
I'm an analyst now, and I've noticed biology is a second-class science in the eyes of a lot of hiring managers when it comes to analytics. Not everyone knows how to use data in bio, but if you are a data-type person, you get so much experience working with the worst data imaginable. The pure math types aren't that great at experimental design, and the physics and chem people tend to be able to control most of their variables pretty easily.
I'll admit that bio people tend to be a bit weaker in math, but almost every real-world analysis situation I've been in has been pretty straightforward mathematically. Most of the time is spent getting the data to the point where it can be used.
There's some neat software now (not sure if it was around when you were in college) to translate video into reliable data about the movement of animals. I've used it for fish, but the same software can be used to track people around a room or to track mice in a cage. I think that might be much more useful than, say, how many backflips a mouse does in their cage.
This was about 9 years ago, so it probably existed in some form, but our lab computers were still Apple IIs because that was what the software for the VO2 sensors was written for. The lab group doing the caffeine experiment was trying to measure whether resting VO2 went up when dosed with caffeine, so the fact that the rats were moving at all was the primary problem.
We tried using a machine vision program written by a grad student at another college to count trees from old survey photos at one point, and it did not work well at all in anything with more trees than a park-like setting. We ended up just having a human circle all the trees they saw and I wrote a program to detect the circles. It worked much better. The program I created was based on something similar used to count bacteria colonies on petri dishes.
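For a sense of what that kind of circle detection could look like, here's a minimal sketch using OpenCV's Hough transform - to be clear, this is not the original program, and the file name and parameter values are made up and would need tuning for real survey photos:

    import cv2

    # Illustrative input: a scan where a human has circled every tree.
    img = cv2.imread("annotated_survey_photo.png")     # hypothetical file name
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)                     # smooth out film grain

    # The Hough transform looks for circular shapes; thresholds and radii
    # are placeholders and would need tuning per photo set.
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                               param1=100, param2=30, minRadius=5, maxRadius=40)

    count = 0 if circles is None else circles.shape[1]  # result shape: (1, N, 3)
    print("detected circled trees:", count)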
You can add to this that if you are in a facility with both rats and mice, you should never house them in the same room, and you should probably not enter a mouse room after being in a rat room. In the wild, rats are predators of mice.
But in the wild, there aren't separate rooms for predators and prey, either. If laboratory rodent experiments are so reliant on such synthetic conditions, what's the point?
The point is to isolate variables. If you want to see whether X might be causing Y, you can't just do X in a random environment while A, B, C, D and Q are happening, and claim that if you observe Y it was because of X. You have to isolate X and establish a link between the environment with X producing Y and the same (to the extent possible) environment without X not producing Y. Then you can take a step (still not enough to be sure, but at least enough to start suspecting) toward X causing Y. Of course, it could turn out that X causes Y only when A is present but B is absent, and then your experiment will be a failure. Or maybe Y just randomly happens and you had bad luck to land on it exactly when you did X. Nobody said it's always easy :)
If some mice are housed near rats and others are not (or simply vary in how far they are from the nearest rat), that will introduce variation into your measurements. For experiments not specifically about mouse/rat interactions, this variation is irrelevant noise that makes it harder to detect or characterize the effects the experimenters actually care about.
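A toy simulation of that point, assuming numpy and scipy (all numbers made up purely for illustration): the same true effect becomes much harder to detect once an uncontrolled factor adds extra variance.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    true_effect = 1.0      # real difference between treated and control mice
    n = 20                 # mice per group
    trials = 2000

    def detection_rate(extra_sd):
        # extra_sd models an uncontrolled factor, e.g. distance to a rat room
        hits = 0
        for _ in range(trials):
            control = rng.normal(0.0, 1.0, n) + rng.normal(0.0, extra_sd, n)
            treated = rng.normal(true_effect, 1.0, n) + rng.normal(0.0, extra_sd, n)
            if stats.ttest_ind(treated, control).pvalue < 0.05:
                hits += 1
        return hits / trials

    print("detection rate, housing controlled:  ", detection_rate(0.0))
    print("detection rate, housing uncontrolled:", detection_rate(2.0))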
That's not my point, though. Like, if you conduct physics experiments in a vacuum because interactions between objects and the atmosphere aren't part of what you're studying, that's fine. But if everyone in physics is conducting experiments that way, then suddenly you're left with the question: in real life I encounter the atmosphere all the time, so how much can these experimental results tell me about the world I actually interact with? And if it turns out that all the preconditions to physics experiments can't be published because there are too many of them to list, and you're just expected to know these things, doesn't that throw the credibility of the whole enterprise into doubt? Controlling for variables is fine. But if you aren't comprehensively listing all the variables you're controlling for, if the very idea of doing so is considered a fool's errand, then are you even really doing science?
> "if you aren't comprehensively listing all the variables you're controlling for [...] then are you even really doing science?"
Mate I think that's the point of this whole thread that you're commenting in. And the tangential point to the article posted.
Science isn't some binary thing. You can do poor science, and you can do great science. Some variables are hard or impossible to control for. Some fields make this simpler than others. I'd say that as we've continually endeavored with the sciences we're probably better at it now than we've ever been before.
Synthetic conditions are absolutely critical to science. Typically, the more conditions you can specify in the experiment, the more reproducible it should be. Some of these are very difficult, and others in the thread have pointed out that some don't get labeled in the journals.
If we ran such experiments in the wild, completely outside of control, then we can never know what we're really observing. By controlling the environmental variables your observations gain meaning.
Yes, because you are trying to control your variables? Just because physics is the most amenable to experimental control does not mean that the only real science is physics.
I mean, by that measure medicine is not a science either, because we don't know most of the possible confounding variables. That doesn't mean the scientific method isn't still the correct approach.
> Just because physics is the most amenable to experimental control does not mean that the only real science is physics.
That's not what I'm trying to say here at all. The point isn't about how amenable a field is to experimental control. The point is that even when variables are easy to isolate, as in the physics example above (which I don't think is true in all of physics, by the way), it's not free. You're trying to compensate for the lack of statistical power to measure an effect in noisy data by cutting down on the noise in the data. But you're doing it by generating the data in an environment that doesn't exist outside of laboratory conditions. Writing off replication failure as not being a problem because lab conditions are difficult to reproduce misses this; if the findings are difficult to replicate in other conditions, that could indicate that the findings are narrower in scope than the study suggests. As I pointed out downthread, for example, if all the rodents in an experiment on a drug are on the same diet, all the experiment proves (assuming it's otherwise well run) is that the drug works in combination with that diet. If the drug works independently of diet, then the findings on the drug are generalizable. If it doesn't, they aren't. And if you have 60 years of medical research based in part on studies of rodents eating diets very different from what rodents eat in the wild, or what people eat, then it raises all sorts of questions about the state of medical research. That doesn't mean medicine isn't a real science; it just raises questions about how well it tells us what we think it's telling us.
You've gotten a couple of answers, but my answer might rely heavily on my field. I work in early stage pharma discovery research. So our goal isn't to determine the basal level of some cytokine, for example, in normal mice. Our goal is typically to see that cytokine's response to treatment, or stimulus, or stimulus then treatment. In other words, the mouse is a living system one step more complicated than a dish of cells, which is its value to us. Sometimes, once you take a drug to non-human primates or humans, you have to drastically change the type of study you run to determine efficacy.
So why aren't mouse studies done in acoustically isolated rooms? Too expensive, or lack of forethought? I know of metrological laboratories that are mechanically insulated with all sorts of contraptions making the space vibration-free, so it's not as if buildings can't be adapted for their intended scientific use as a matter of principle.
In my experience, you're lucky if you can get space for your equipment at all.
One week, they're moving your growth chambers out into the hallway to work in the ceiling. The next, they had to cut power for 8 hours for maintenance, and by the way, it wasn't plugged into an outlet with emergency power. Oops.
Hell, they can't even keep the lab temperature steady. Solutions sitting on your bench will start to precipitate out.
So yeah, the issue is money. It's also planning; you never know what the needs of researchers are going to be in a few years.
In the end, nice facilities can certainly help with a lot, but they don't address the core issues of experimental variables and combinatoric complexity. The way you deal with this is skeptical peers that understand the methods, reliance on robust methods wherever possible, and independent methods to confirm results. Even with all this replication difficulty, it is quite possible to make compelling conclusions.
The classic ideal of controlling a single variable at a time is nice but largely impossible. That's especially so in the biological sciences, but it's true even down in chemistry and physics: the systems scientists are trying to control are complex, and it's not always possible to modify one variable without affecting others.
Independent experiments and controls. AKA, literally the basis of experimental science.
Let's say you want to know where a protein localizes in a cell, of a given tissue, in both mutant and wild-type organisms.
* Immunolocalization. You develop antibodies to the protein of interest, fix and mount tissue, perfuse it with the antibody, and use a secondary antibody to make it detectable.
* Fluorescent tagging. You make a construct with your protein fused to a fluorescent protein. Usually this involves trying a few different tagging strategies until you find one that expresses well. Then you can make a stable transgenic, or try a transient assay. If it works, then you see some nice glowy confocal images. Be careful, though, as the tagging can affect protein localization.
* Fractionation. In some cases, you can get a rough idea by e.g. extracting nuclei and doing a simple Western blot to see where the protein shows up.
In the real world, you might start with the FP-tag and see that your protein is absent from the nucleus in your mutant. Which would be cool, and interesting in terms of figuring out its function. If that were presented as the only result, I'd reject the paper and think the authors are terrible investigators. I'd want to see, at least, some nuclei preps that detect the protein in the WT and don't in the mutant. I'd love to see immunos too, as FP often does mess up the localization.
You can take it even further and start doing deletions. You take the protein and crop bits out to see what happens. You should see stuff like removing the NLS makes it stop going to the nucleus. That's a good sanity check and a sign your methods are working. You can also try to mess with active sites, protein-protein interaction domains, etc. etc. All within a theoretical model of what you think is going on in the cell.
Ultimately, the difficulty of replication isn't that troublesome. An inability to do so is, but science has never been easy. That's what you sign up for, and that's why you need to read papers critically. You get a sense of distrust for data unless there's really solid evidence of something. And when you find that solid evidence, you get a big smile and a warm feeling in your nerd heart.
In some cases you use various statistical tools over many trials or in others you measure the effects of the variables in other experiments. Or you can control the important variables and allow those with small or no effect on what you're measuring to vary slightly.
There is continual progress on standardization of methodology, but it takes a while. People have vested interests in using their own protocols, and there are often valid disagreements about what is actually the best protocol.
Nevertheless, these kinds of standardization documents are important - as long as theoretically justified deviations from the standards aren't discounted.
> Think about the epistemological ramifications of this.
This this this this this a million times this. If your experiment with lab rats needs to be done this finely to be replicated, is it really telling you anything about the real world? Because of course, we don't really care about the behavior of laboratory rats, right? Not in proportion to how often we do studies with laboratory rats versus other species of animals. We care because of what it can tell us about biology in general. And if the finding can't even generalize to "laboratory rats whose cages have been cleaned recently," does the study really say what it seems to?
The point, I think, is that you want to control for things that really alter the outcome. It's not that a clean cage means that the end-product no longer works, but you want to compare "didn't have drug" vs "did have drug" not "didn't have drug" vs "did have drug but the fire alarm went off this morning". Unless your effect size is much larger than all the noise, you could easily miss a true result.
Worse, you could get a false result the other way. What if your control group had all been spooked before they were measured a few times? That could make the control group seem worse than they are!
So, let's continue with your example of a drug being given to the rodents. Now, you want to control diet -- you don't want your control group eating differently than your treatment group. This ends up being relatively easy to do with laboratory rodents, they aren't popping down to McDonalds for lunch or lying in their food diaries, like human subjects might. You buy your rodent food from one rodent pellet supplier, and you don't change brands or SKU during the experiment, and voila, control.
And this works fine if the drug you're studying works identically given all rodent diets. But you don't know that it does! You're not controlling for the variable of diet in the sense that you know the effect of the variable of interest across all possible diets. You know how the drug works conditional on one specific rodent diet. And if you're not reporting what brand of rodent food you use, and that causes replication failures, that suggests you don't understand the effect of the drug as well as you thought you did. And if you have an entire field of research that undergoes such replication failures often, it's fair to start wondering how much of what gets called "science" isn't the study of the natural world but the study of the very specific sets of conditions that happen in labs in American and European universities.
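A tiny sketch of that generalizability worry, using a made-up model where the drug's effect depends on diet: an experiment run entirely on one chow brand measures "drug under that diet", not "drug".

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50  # rodents per arm

    def outcomes(drug, diet_b, noise=1.0):
        # Hypothetical model: the drug helps on diet A but does nothing on diet B.
        effect = 1.0 if (drug and not diet_b) else 0.0
        return rng.normal(effect, noise, n)

    # Original lab: every animal on diet A -> clear apparent drug effect.
    lab_diff = outcomes(True, diet_b=False).mean() - outcomes(False, diet_b=False).mean()
    # Replication with a different chow supplier (diet B) -> the effect vanishes.
    rep_diff = outcomes(True, diet_b=True).mean() - outcomes(False, diet_b=True).mean()

    print("apparent drug effect on diet A:", round(lab_diff, 2))
    print("apparent drug effect on diet B:", round(rep_diff, 2))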
Yes, but the things they mentioned would be bizarre to control for the other way.
Would you expect the tests to be done, then again but with both rats hearing the person upstairs installing a new microscope, then with a regular fire alarm, etc?
Describing the food given is quite a step away from describing how frequently the fire alarm goes off.
Exactly. The grandparent comment raises even more troubling questions about this whole enterprise of our species we call science than even the original article does.
A load of variables that need to be controlled in order to get the original experiment to work is kind of a red flag, though. Someone could just keep trying the experiment and adding another variable to control until it works. But really they're just running a bunch of experiments and getting lucky at some point - assuming they stop adding controls, or trying to replicate themselves, as soon as they get a positive result :)
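A toy illustration of that worry (made-up numbers, assuming numpy/scipy): with no real effect at all, re-running the experiment until one attempt clears p < 0.05 and stopping there inflates the false-positive rate far above the nominal 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n, alpha, max_attempts, trials = 20, 0.05, 10, 2000

    false_positives = 0
    for _ in range(trials):
        for _ in range(max_attempts):       # "tweak something" and re-run, up to 10 times
            a = rng.normal(0.0, 1.0, n)     # control group, no true effect
            b = rng.normal(0.0, 1.0, n)     # "treated" group, also no true effect
            if stats.ttest_ind(a, b).pvalue < alpha:
                false_positives += 1
                break                       # stop at the first "positive" result

    print("nominal false-positive rate:", alpha)
    print("rate when you retry until it works:", false_positives / trials)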
Yes and no. Of course, in something like an animal model it can be really hard to control everything, and the result you find could just be 'luck'.
On the other hand, figuring out what variables to control is a huge part of science. Take the development of next-generation DNA sequencing technologies: people tried a ton of different variables, conditions, reagents, flow cells, etc., and failed and failed. But eventually they controlled the right conditions, optimized the right things, and now the process is done in thousands of labs every day as a routine tool. This is a technology-development example, but the same could be said of the conditions needed to make stem cells.
It's not a red flag, it's reality. Biology is complicated, no matter what the synthetic biologists tell you :)
You use independent methods and good controls to deal with this. This is nothing new, it's been the basis of experimental science for decades. If you're allowed to publish without doing this, the field has failed. Molecular biology, in particular, has flourished because of the effectiveness of genetic controls. Mutants are highly reproducible (you just send out seed, cultures, or live specimens), verifiable (sequencing), and combinatoric (crossing).
Wouldn't this allow you to explain away all negative results? Say you get a negative result and you don't like it. Then you start fishing for a cause and find out someone sneezed during the experiment, so you delete the null result. Isn't that problematic?
I'm surprised no one has mentioned the company Mousera (http://www.vium.com/). Granted, it is not a solution for all of science, but it essentially reduces many animal experiments to scripts, which can be run almost the way I might run a Docker container and script. The idea is incredible. You can even scale it up by running the script multiple times.
Disclaimer: I did bench and animal work 21 years ago during undergrad, but haven't used this company. So I know the field and how difficult replication is, but I'm not sure what has happened in wet science since 1997.
> I looked into the subsequent history of this research. The subsequent experiment, and the one after that, never referred to Mr. Young. They never used any of his criteria of putting the corridor on sand, or being very careful. They just went right on running rats in the same old way, and paid no attention to the great discoveries of Mr. Young, and his papers are not referred to, because he didn’t discover anything about the rats. In fact, he discovered all the things you have to do to discover something about rats. But not paying attention to experiments like that is a characteristic of Cargo Cult Science.
Even if you do know all that, it means reproducing the experiment is costly. Properly controlling an environment takes time and resources, and it's not like those are easy to get for anything, let alone for something like peer review.
Heck, I've tried to reproduce CS papers with incomplete methodologies. There's always enough that I know it will work, but they normally don't include all the tricks you need to get it to work easily or efficiently. Stuff like "we used an SMT solver with this important feature" with no mention of which solver, or "a key part of this algorithm is factoring the sparse matrix with this non-standard technique that you'll have to implement yourself" without telling you the number of tricks they used to make it as fast as you need.
That is frustrating. In my field of cognitive neuroscience, there is often little incentive for a researcher to hide parts of their methods from their 'competitors' since, for example, my memory study is not about reporting but protecting a new technology. Indeed, prestige is often a consequence of others adopting your methods, so researchers are motivated to share scripts, and report the methods fully.
HOWEVER, certain journals can place limits on the length of the methods section, which is a damn shame.
> there is often little incentive for a researcher to hide parts of their methods from their 'competitors' since, for example, my memory study is not about reporting but protecting a new technology
If you are protecting, not reporting information (e.g., about a new technology), why would you be incentivized to share it?
Shouldn't it be the responsibility of the original experimenter to provide the information necessary to repeat the experiment? I see nothing in your list that sounds like a good excuse for not being repeatable.
I disagree. Grammar rules are a very effective checklist for clarity and precision in communication, IME. Does the verb have a subject and object? Otherwise, it's not clear who should do what to what. The instructions say, 'turn it on'; does "it" have a clear antecedent? Otherwise, what are they turning on?
I would guess that the practical benefits are a big reason that they are the 'rules'.