
My PhD work is trying to address this by developing better ways for scientists to record and communicate their methods/protocols. Methods sections are NOT about providing the information needed to reproduce the work; often even the supplemental information is insufficient. They are a best guess at what needs to be done, and the methods section is often massively condensed due to editor demands, citing other papers that also don't contain the relevant information because some aspect of the method changed between papers. (I could go on and on about this.)

I spent 7 years doing experimental biology (from bacteria to monkeys), and trying to replicate someone else's techniques from their papers was always a complete nightmare. Every experimentalist I talk to about this relates the same experience -- sometimes unprompted. Senior faculty tell a slightly different story, that they can't interpret the data of someone who has left the lab, but it is the same issue. We must address this; we have no choice. We cannot continue as we have for the past 70+ years -- the apprenticeship system does not scale for producing communicable/replicable results (though it is still the best way to train someone to actually do something).

EDIT: An addendum. This stuff is hard even when you assume that all science is done in good faith. That said, malicious or fraudulent behaviour is much harder to hide when you have symbolic documentation/specification of what you claim you are doing, especially if it is integrated with data acquisition systems that sign their outputs. There are still many ways around this, but post hoc tampering is hard if you publish a stream of git commit hashes publicly.
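A minimal sketch of the commit-hash idea, assuming Python and a plain git repository (the function name and file paths are hypothetical placeholders, not any particular acquisition system):

    import hashlib
    import subprocess
    from pathlib import Path

    def record_acquisition(data_path: str) -> str:
        """Hash a freshly acquired data file, commit it, and return the commit hash."""
        digest = hashlib.sha256(Path(data_path).read_bytes()).hexdigest()
        subprocess.run(["git", "add", data_path], check=True)
        subprocess.run(
            ["git", "commit", "-m", f"acquire {data_path} sha256={digest}"],
            check=True,
        )
        commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            check=True, capture_output=True, text=True,
        ).stdout.strip()
        # Publishing this hash somewhere outside your control (a public feed,
        # a timestamping service) is what makes post hoc tampering visible.
        return commit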




Wouldn't it be hard to achieve deep consistency between experiments, in so many labs around the world, with such different conditions/cultures/etc., and when the experimenters aren't experts in consistency, but in their science?

Wouldn't it be better to use something like a cloud biology model - where you define experiments via code, CROs compete on consistency (and efficiency and automation), and since they probably do a much larger volume of experiments than the regular lab, they would have stronger incentives to develop better processes and technologies?


I work at a cloud bio lab. We run all of our experiments on automation, and all protocols must be defined in code. The latter is both the power and the difficulty -- when your protocol is defined in code it is explicit. However, writing code is both new and sometimes difficult for the scientists we currently work with (molecular biology, drug discovery). I believe what we are doing is the right model, but it comes with the overhead of transitioning assays to code, so there is that against it. This is mostly just a matter of time though. Another nice thing about code is that you can't tweak it once it's running: you can define your execution and analysis up front to guard against playing with results down the road.

Now that being said, there still needs to be a significant change to how research is funded and viewed by the public, because pure tech solutions can't solve everything. Our tech can't decide what you pick to research. It can't dish out grants to the truly important research. So it will take many angles to really solve any portion of this problem.
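For anyone who hasn't seen a protocol expressed as code, here is a minimal, hypothetical sketch (the Protocol/Step classes and step names are illustrative only, not any particular cloud lab's API). The point is that every step and every quantity that matters is written down and fixed before the run starts:

    from dataclasses import dataclass, field

    @dataclass
    class Step:
        action: str
        params: dict

    @dataclass
    class Protocol:
        name: str
        steps: list = field(default_factory=list)

        def add(self, action: str, **params) -> "Protocol":
            self.steps.append(Step(action, params))
            return self

    # A toy assay: a transfer, a mix, and an incubation, with nothing left
    # to lab folklore.
    assay = (
        Protocol("dose_response_v1")
        .add("transfer", source="stock_A1", dest="plate1_A1", volume_ul=100)
        .add("mix", well="plate1_A1", repetitions=5, volume_ul=50)
        .add("incubate", target="plate1", temp_c=37, duration_min=30)
    )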


I do agree, it seems like the right model, and it will have a large impact.

Between automating labor, economies of scale in purchasing, and access to more efficient technology (like acoustic liquid handling), etc. - isn't it just a matter of time before cloud biology becomes quite cost effective, and combined with other benefits - it would be the only way that makes sense to do research, so funding will naturally go there?

Also - do you see a way to add the extreme versatility of the biology lab into a cloud service?


> it would be the only way that makes sense to do research

There will certainly be more than just one way, although I hope cloud labs are the front runner. Also, cloud and automated are two separate concepts. We do both, but there's no reason that you can't just do one or the other. The automation is critical for reproducibility for many reasons. But I think the cloud aspect is mostly helpful from a business perspective -- it makes it easier on everyone to get up and running on our system. But there are many in-lab automation solutions that are helping fight the reproducibility crisis. And on the flip side, there are cloud labs that aren't automated.

> do you see a way to add the extreme versatility of the biology lab into a cloud service

We let you run any assay that can be executed on the set of devices that we have in our automated lab. So in that sense, yes, it's very flexible. Also, there's no need to run your entire workflow in the cloud. You can do some at home, some in the cloud. Some people even string together multiple cloud services into a workflow. See https://www.youtube.com/watch?v=bIQ-fi3KoDg&t=1682s

That being said, biology labs can be crazy places. Part of what we do is put constraints on what can be encoded in each protocol to reduce the number of hidden variables. Every parameter that counts must be encoded in the protocol, because once you hit "go" on the protocol, it could possibly run on any number of different devices each time it runs. The only constant is that the exact instructions specified in the protocol will be run on the correct device set.
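As a toy illustration of that constraint (the step names and required fields below are made up, not our actual schema): a protocol submitted as plain data can be rejected up front if any parameter that matters is missing, so nothing is left to whichever device happens to execute it.

    # REQUIRED lists the parameters that must be spelled out for each step
    # type before a protocol is accepted for execution.
    REQUIRED = {
        "transfer": {"source", "dest", "volume_ul"},
        "incubate": {"target", "temp_c", "duration_min"},
    }

    def validate(steps):
        for step in steps:
            missing = REQUIRED.get(step["action"], set()) - step.keys()
            if missing:
                raise ValueError(f"step '{step['action']}' is missing: {sorted(missing)}")

    validate([
        {"action": "transfer", "source": "stock_A1", "dest": "plate1_A1", "volume_ul": 100},
        {"action": "incubate", "target": "plate1", "temp_c": 37, "duration_min": 30},
    ])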


1. Yes, but the idea would be that if you provide a way to communicate the variables that actually matter for consistency then you can increase the robustness of a finding. If you have one lab that can _always_ produce a result, but no one else can, then clearly we do not really understand what is going on and one might not even be willing to call the result scientific.

2. Maybe not better, but certainly more result-oriented. Core facilities do exist right now for things like viral vectors and microscopy (often because you do need levels of technical expertise that are simply not affordable in single labs). If there were a way to communicate how to do experiments more formally, then the core facilities could expand to cover a much wider array of experiment types. You still have to worry about robustness, but if you have multiple 'core' facilities that can execute any experiment then that issue goes away as well. The hope of course is that individual labs as they exist today (perhaps with an additional computational tint) would be able to actually replicate each other's results, because we will probably end up needing nearly as many 'core' facilities as we have labs right now, simply because the diversity of phenomena that we need to study in biology is so high.


There are already approaches in this direction, e.g. providing a standardized experimental hardware-software interface with Antha [1]. The complexity of the problems in question (biological domain, biophysical, biochemical) is daunting - we do not understand many things, "there is plenty of room at the bottom".

[1] https://www.antha-lang.org/


How about recording videos of the lab? People trying to reproduce the experiment can just sift through the video. That may be tedious but it's far better than nothing.

Just a little metadata would help: Experiment A, Phase N, Day X


Video and photographic evidence can play a big role when we have the extra bandwidth to process such a dataset. Right now we barely have time to do the experiments, much less 'watch tape' to see how we did (maybe if scientists were paid like professional sports players...). In an ideal world we would be collecting as much data as we possibly could about the whole state of the universe surrounding the 'controlled' experiment. That said, video and photographs are very bad at communicating important parameters in an efficient way. Think about how hard it is to get information out of a youtube video if you need something like a part number. Photos do better, but if you need to copy and paste out of a photo, it will take a bit more heavy lifting to translate that into some actionable format (e.g. ASCII).


I didn't mean record it to use the data in your analysis, but record it to preserve the methods for others. If they have trouble getting part of the experiment to work, they can pull up the video and see how you did it (at least to a degree; I'm not expecting 360 video). You can't possibly record in text everything a video could capture.

Thanks for sharing your knowledge and experience in this discussion, by the way. It's what makes HN great.


Ah, yes, things like JOVE [0] are definitely useful, but they don't seem to scale to the sheer number of protocols that need to be documented (e.g. a single JOVE publication is exceedingly expensive). I have also heard from people who have tried to record video of themselves doing a protocol that it is very hard to make the videos understandable for someone else. That said, if the 'viewer' is highly motivated, videos of any quality could be invaluable. Sometimes it is just better to buy the plane tickets and go directly to the lab of the person who can teach you (if they are still around).

0. https://www.jove.com/


> if the 'viewer' is highly motivated videos of any quality could be invaluable

That's what I meant. Just stick some cameras in the ceiling (or wherever is best) and capture what you can. It seems cheap and better than nothing, but I know nothing about biological research.


True, biology is a horror show full of surprises. Many experimental instructions are as reliable as astrological forecasts. I guess that is the price of complexity and the human factor.

I would love to hear more about your work, and the strategies you propose to improve the reproducibility of scientific experiments. My email can be found in my user description.

Cheers!


Have you looked at the Common Workflow Language (CWL)? See below.

A friend of mine has been experimenting with wrapping it all up in Docker containers! :-)

> a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.

http://www.commonwl.org
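For a concrete feel, here is a minimal sketch along the lines of the CWL user guide's echo example, run with the reference cwltool runner (file names are placeholders; real tools declare outputs and often a DockerRequirement so the software environment travels with the workflow):

    import subprocess
    from pathlib import Path

    import yaml  # PyYAML

    # A minimal CommandLineTool description: run `echo` with one string input.
    tool = {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "baseCommand": "echo",
        "inputs": {"message": {"type": "string", "inputBinding": {"position": 1}}},
        "outputs": [],
    }
    job = {"message": "Hello, reproducible world"}

    Path("echo.cwl").write_text(yaml.safe_dump(tool))
    Path("echo-job.yml").write_text(yaml.safe_dump(job))

    # cwltool is the reference runner; other engines (Toil, Arvados, ...) accept
    # the same description unchanged, which is the portability point above.
    subprocess.run(["cwltool", "echo.cwl", "echo-job.yml"], check=True)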



