Whilst the experimental subjects and data collection are inherently fraught with difficulty, there's still a LOT of low-hanging fruit regarding things like automation. Many scientists use computers to write up results, to store data and perform calculations, but there's often a lot of manual, undocumented work which could easily be scripted to help those re-running the experiment. For example, running some program to produce a figure, without documenting what options were used; providing a CSV of data, without the formulas used for the aggregate statistics; relying on a human to know that the data for "fig1.png" comes from "out.old-Restored_Backup_2015~"; etc.
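To make that concrete, here's a rough sketch of what scripting one such step might look like (Python; the file names, column name, options and aggregate statistics are purely illustrative, not taken from any real workflow). The point is just that the figure, the options used to produce it, and the provenance of its input all get written down by the script instead of living in someone's head:

    # Hypothetical sketch: regenerate fig1.png from the raw data and record
    # exactly which input file, options and formulas were used.
    import argparse
    import csv
    import hashlib
    import json
    import statistics
    import sys

    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    def main():
        parser = argparse.ArgumentParser(description="Regenerate fig1.png")
        parser.add_argument("--data", default="measurements.csv")
        parser.add_argument("--column", default="weight_g")
        parser.add_argument("--bins", type=int, default=20)
        args = parser.parse_args()

        with open(args.data, newline="") as f:
            values = [float(row[args.column]) for row in csv.DictReader(f)]

        plt.hist(values, bins=args.bins)
        plt.xlabel(args.column)
        plt.savefig("fig1.png")

        with open(args.data, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()

        # Provenance: which file, which options, which aggregate formulas.
        provenance = {
            "command": " ".join(sys.argv),
            "input": args.data,
            "input_sha256": digest,
            "options": vars(args),
            "aggregates": {
                "mean": statistics.mean(values),
                "stdev": statistics.stdev(values),
            },
        }
        with open("fig1.provenance.json", "w") as f:
            json.dump(provenance, f, indent=2)

    if __name__ == "__main__":
        main()

Anyone re-running the experiment can then see at a glance which data produced which figure, and can re-run the same command on their own data.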
Such scripting is a step on the path to formalising methods. The scripts would help those who just want to see the same results; those who want to perform the same analysis on different data; those who want to investigate the methods used, whether looking for errors or surveying the use of statistics in the field; those who want a baseline from which to reproduce the experiment/effect more thoroughly; etc.
The parent's list of mouse-frighteners reminds me of the push for checklists in surgery, to prevent things like equipment being left inside patients. Whilst such lists are too verbose for a methods section (it would suffice to say e.g. "Care was taken to ensure the animals were relaxed."), there's no reason the analysis scripts can't prompt the user about such ad hoc conditions, e.g. "Measurements should be taken from relaxed animals. Did any alarms sound in the previous 24 hours? y/N/?", "Were the enclosures relatively clean? Y/n/?", "Were the enclosures cleaned out in the previous 24 hours? y/N/?", etc., with output messages like "Warning: Your answers indicate that the animals may not have been relaxed during measurements. If the following results aren't satisfactory, consider ..." or "Based on your answers, the animals appear to be in a relaxed state. If you discover this was not the case, we would appreciate it if you updated the file 'checklist.json' and sent your changes to 'experimentABC@some-curator.org'. More detailed instructions can be found in the file 'CONTRIBUTING.txt'."
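As a very rough illustration (again Python; the questions, defaults, 'checklist.json' and the messages are just the ones from the example above, and a real script would prompt for whatever the lab actually cares about), the prompting could be as simple as:

    # Hypothetical checklist prompts; '?' always means "don't know".
    import json

    # (question, expected answer) pairs; the expected answer becomes the default.
    QUESTIONS = [
        ("Did any alarms sound in the previous 24 hours?", "n"),
        ("Were the enclosures relatively clean?", "y"),
        ("Were the enclosures cleaned out in the previous 24 hours?", "n"),
    ]

    def ask(question, default):
        hint = "Y/n/?" if default == "y" else "y/N/?"
        while True:
            answer = input(f"{question} {hint} ").strip().lower() or default
            if answer in ("y", "n", "?"):
                return answer
            print("Please answer y, n or ? (don't know).")

    def main():
        answers = {q: ask(q, d) for q, d in QUESTIONS}
        with open("checklist.json", "w") as f:
            json.dump(answers, f, indent=2)

        # Any answer that differs from the expected one (and isn't "don't know")
        # suggests the animals may not have been relaxed.
        suspicious = [q for q, d in QUESTIONS if answers[q] not in (d, "?")]
        if suspicious:
            print("Warning: your answers indicate the animals may not have been "
                  "relaxed during measurements. If the following results aren't "
                  "satisfactory, consider the conditions above.")
        else:
            print("Based on your answers, the animals appear to have been relaxed. "
                  "If you discover this was not the case, please update "
                  "'checklist.json'; see 'CONTRIBUTING.txt' for how to send your "
                  "changes to the curator.")

    if __name__ == "__main__":
        main()

The answers end up alongside the data in version control, so a later reader (or a reproduction attempt) can see exactly which ad hoc conditions were checked.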
I like the idea, but who the hell is ever going to go through all of that? Yes, you made some checklist, great. But no other lab is going to go through all of that. And in your field, if you are very lucky, you may have just 1 other lab doing anything like what you are doing. It would be a checklist just for yourself/lab, so why bother recording any of it? Yes, do it, fine, but how long should you store those records that will never be seen, even by yourself? Why in god's name would you waste those hours/days just going over recordings of you watching a mouse/cell/thingy to make sure some uncountable number of little things did or did not happen? If you need that level of detail, then you designed your experiment wrong and the results are just going to be swamped in noise anyway. You are then trying to fish significant results out of your data, which is exactly the wrong way to run an experiment. Just design a better trial; there is no need to generate even more confusing data that has a 1/20 chance of being significant.
The checklist doesn't have to be at that level of detail; it just has to exist and be generic enough. The fire-alarm example is interesting: to me, the existence of such factors is a smoking gun pointing at potential improvements to the experimental environment. Why not exclude ALL stress factors by designing something like a sound-proof cage? Does that need extra budget? Probably, but what about some other unaccounted-for noise that will ruin the experiment? This suggests a better checklist item: ensure that the experiment provides a stress-free environment by eliminating sound, vibration, smells, etc.
> who the hell is ever going to go through all of that?
It's not particularly onerous, considering the sorts of things many scientists already go through, e.g. regarding contamination, safety, reducing error, etc.
> Yes, you made some checklist, great. But no other lab is going to go through all of that. And in your field, if you are very lucky, you may have just 1 other lab doing anything like what you are doing. It would be a checklist just for yourself/lab, so why bother recording any of it?
Why bother writing any methods section? Why bother writing in lab books? I wasn't suggesting "do all of these things"; rather "these are factors which could influence the result; try controlling them if possible".
> Yes, do it, fine, but how long should you store those records that will never be seen, even by yourself?
They would be part of the published scientific record, with a DOI cited by the subsequent papers; presumably stored in the same archive as the data, and hence subject to the same storage practices. That's assuming your data is already being published to repositories for long-term archive; if not, that's a more glaring problem to fix first, not least because some funding agencies are starting to require it.
> Why in god's name would you waste those hours/days just going over recordings of you watching a mouse/cell/thingy to make sure some uncountable number of little things did or did not happen?
I don't know what you mean by this. A checklist is something to follow as you're performing the steps. If it's being filled in afterwards, there should be a "don't know" option (which I indicated with "?") for when the answers aren't to hand.
I imagine it would be easy to have a git-like storage system for this information, where reproduction experiments would be a branch without the actual measurement data.
> a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
While this is interesting to speculate about, perhaps it would be best to start with something like the machine learning literature, where everything is already run computationally, and those in the field have the skills to easily scratch their own itch to improve the system so that it works for them.
Even in machine learning, how difficult would it be to get the field to adopt a unified experiment-running system? It sounds like a huge engineering project that would have to adapt to all sorts of computational setups: all sorts of batch systems, all sorts of Hadoop or Hadoop-like systems. And even that would be far easier than handling wet-lab work.
I think the lack of something like this in ML shows that there's enough overhead that it would get in the way of day-to-day work. Or maybe it just hasn't been invented yet in the right form. There are loads and loads of workflow systems for batch computation, but I've never encountered one that I like.
In genomics, one of the more popular tools for that is called Galaxy. But even here, I would argue that the ML community is much better situated to develop and enforce use of such a system than genomics.
I agree that computational fields are better suited to spearhead such approaches, but I don't think machine learning is a good example. ML researchers are constantly pushing at the frontiers of what our current technology can do; consider that a big factor in neural networks coming back into fashion was the ability to throw GPUs at them. The choice of hardware can make a huge difference in outcomes; some researchers are even using their own hardware (the work being done on half-precision floats comes to mind); any slight overhead will get amplified by the massive amount of work to be computed; and so on.
Maybe a field that's less dependent on resources would be a better fit. An example I'm familiar with is work on programming languages: typechecking a new logic on some tricky examples is something that should work on basically any machine; benchmarking a compiler optimisation may be trickier to reproduce in a portable way, but as long as it's spitting out comparison charts it doesn't really matter if the speedups differ across different hardware architectures.
When the use of computers is purely an administrative thing, e.g. filling out spreadsheets, drawing figures and rendering LaTeX (e.g. for some medical study), there's no compelling reason to avoid scripting the whole thing and keeping it in git.