Even if you try to account for the overall R&D cost, DeepMind isn't that large an organization by the standards of biomedical research. It's very big and well funded for a computer science research organization, yes, and most CS departments can't match its resources. But the NIH budget is $40 billion, and private pharmaceutical companies do another $80 billion in annual R&D. It's interesting that this kind of breakthrough didn't come from those sectors.
DeepMind is taking advantage of NIH's funding.
For example, Anfinsen, who demonstrated that proteins fold spontaneously and reproducibly (https://en.wikipedia.org/wiki/Anfinsen%27s_dogma), ran a lab at NIH. Levinthal (who postulated an early, easily refuted model of protein folding) was funded by NIH for decades. Most of the competitors at CASP are supported by NIH, and its investments have contributed significantly to the modern results.
That said, I think the academic and pharma communities had engineered themselves into a corner and weren't going to see huge gains (even though they are exploring similar ideas) for a number of banal reasons.
That's a good point; this system certainly didn't come from nowhere! The protein datasets they used also mostly came out of various NIH-funded projects.
What I meant to focus on was that I think DeepMind has less of a pure money/scale advantage in this area than in some others. In something like Go or Atari game-playing, there are many academic groups researching similar things, but their resources are laughably small compared to what DeepMind threw at it. So you might argue that they got good results there in part because they directed 1000x the personnel and compute at the problem compared to what any academic group could afford. In biomed, though, their peers in academia and industry are also pretty well funded.
Personally I think a major part of the secret sauce is Google's internal compute infrastructure. When I was an academic, 50% of my time went to building infra to do my science. At Google, petabytes of storage, millions of cores, algorithms, and brains were all easily tappable within a common software repo and cluster infrastructure. That immediately translates to higher scientific productivity.
Mostly? I left Google to work at a biotech startup in a related area and found that the big three cloud providers have built systems that greatly improve computational science. That said, it's still a lot of work to get productive, and many in the field are really resistant to changes like version control, continuous integration, testing, and architecting distributed systems to handle complex lab production environments.
It seems like spending these government funds on creating new challenges like CASP and ImageNet could have an enormous ROI. Don't let them try to choose the winner; just let them define the game.