Simulation of a 2B-atom cell that metabolizes and grows like a living cell (nvidia.com)
286 points by jonbaer on Jan 28, 2022 | 207 comments



Is this simulation at the atomic level, with all interatomic physics fully simulated, or were some simplifications made?

Are all interatomic interactions simulated separately for each atom, or did they make statistical estimates and rely on some assumptions? Those are two absolutely different types of simulation.


It doesn't appear to be ab initio simulated (e.g. QED up) if that's what you're asking. They appear to swoop in at higher scales (molecular level) and simulate molecular interactions across "hundreds of molecular species" and "thousands of reactions."

Apparently the interface between molecules uses the Chemical Master Equation (CME) and Reaction-Diffusion Master Equation (RDME), both of which I'm unfamiliar with: http://faculty.scs.illinois.edu/schulten/lm/download/lm23/Us...


Yes, this appears to be the underlying simulation software. Here’s a home page link to the project as well: http://faculty.scs.illinois.edu/schulten/Software2.0.html

“Lattice Microbes is a software package for efficiently sampling trajectories from the chemical and reaction-diffusion master equations (CME/RDME) on high performance computing (HPC) infrastructure using both exact and approximate methods.”
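
For anyone who hasn't run into "sampling trajectories from the chemical master equation" before, the classic exact CME sampler is the Gillespie direct method, which is simple enough to sketch. This is purely an illustration with a made-up two-reaction system, not the Lattice Microbes implementation:

    import random

    # Toy Gillespie direct method (exact CME sampling) for:
    #   A + B -> C      (rate k1)
    #   C     -> A + B  (rate k2)
    def gillespie(a, b, c, k1, k2, t_end):
        t = 0.0
        while t < t_end:
            # Propensity = probability per unit time of each reaction firing
            props = [k1 * a * b, k2 * c]
            total = sum(props)
            if total == 0:
                break
            t += random.expovariate(total)        # exponential waiting time
            if random.uniform(0, total) < props[0]:
                a, b, c = a - 1, b - 1, c + 1     # A + B -> C fires
            else:
                a, b, c = a + 1, b + 1, c - 1     # C -> A + B fires
        return a, b, c

    print(gillespie(100, 100, 0, k1=0.001, k2=0.01, t_end=10.0))

The RDME adds a spatial grid on top of this: diffusive hops between voxels are treated as extra "reactions" with their own propensities.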


For anyone who is wondering what QED is: Quantum electrodynamics (QED) https://en.wikipedia.org/wiki/Quantum_electrodynamics


Ah, should have realized when Quod Erat Demonstrandum made no sense...!


I've always imagined Feynman making a QED, QED joke sometime in the late 1940's for this very reason. Like he could end his Shelter Island Conference talk with the joke and then look up from the blackboard... to a bunch of confused and deeply furrowed brows, lol.


The paper (well, the abstract) calls it "fully dynamical kinetic model".

Or, in other words, it doesn't solve the Schrodinger equation at all, but uses well known solutions for parts of the molecules, and focuses on simulating how the molecules interact with one another using mostly classical dynamics.


I do classical molecular dynamics simulations for a living, and I feel the model used in this paper is pretty dramatically different from what would typically be described as classical dynamics. 2B atoms would be absolutely insane for any sort of simulation that resolves forces between atoms or even groups of atoms, especially in organic systems.

As far as I can tell from their model, molecules don't interact with each other ~at all~ through classical dynamics. Rather, they define concentrations of various molecules on a voxel grid, assign diffusion coefficients for molecules and define reaction rates between each pair of molecules. Within each voxel, concentrations are assumed constant and evolve through a stochastic Monte-Carlo type simulation. Diffusion is solved as a system of ODEs.
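
To make that description concrete, here's a toy sketch of that kind of update loop: per-voxel molecule counts, stochastic reactions within each voxel, and diffusive hops between neighbouring voxels. Species, rates, and the crude operator splitting are all made up for illustration; the real solver is considerably more careful.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy reaction-diffusion on a 1-D ring of voxels:
    #   reaction A + B -> B inside each voxel (A degraded, catalysed by B),
    #   diffusive hops of both species between neighbouring voxels.
    n_vox = 50
    A = rng.integers(0, 20, n_vox)
    B = rng.integers(0, 5, n_vox)
    k_react, p_hop, dt = 0.01, 0.1, 0.1

    for step in range(1000):
        # Reactions fire independently in each voxel, propensity k * A * B
        fired = rng.poisson(k_react * A * B * dt)
        A = np.maximum(A - fired, 0)

        # Diffusion: each molecule hops to a random neighbour with prob p_hop*dt
        for X in (A, B):
            hops = rng.binomial(X, p_hop * dt)      # how many leave each voxel
            left = rng.binomial(hops, 0.5)          # split leavers left/right
            right = hops - left
            X -= hops
            X += np.roll(left, -1) + np.roll(right, 1)   # periodic boundaries

    print(A.sum(), B.sum())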

This is a cool large scale simulation using this method, but this is a far cry from an actual atomic-level simulation of a cell, even using the crude approximations of classical molecular dynamics. IMO it is kind of disingenuous for them to say 2B atoms simulation when atoms don't really exist in their model, but it's a press release so it should be expected.


Excuse formatting, on phone. Was gonna put more refs... But phone.

Yes, this is not the standard "force field" pairwise stuff you're used to when you hear "simulation" of biomolecular systems. I don't know if it's quite disingenuous, just not what we expect based on what the vast, vast majority of the field does! It does represent that many atoms. We shouldn't include atoms for the sake of having them, right? It should depend on what questions we're asking of the system.

I like seeing other (simulation or analytic) methods get attention. Lattice methods (HP models [0], hydrophobicity [1], even lattice-Boltzmann), field-theoretic methods (see polymer physics, melts, old theory [2], new theory, and even newer simulation [3]). Even the simplest shit like springs [4]!

[0] https://scholar.google.com/citations?view_op=view_citation&h...
[1] https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.11...
[2] https://www.google.com/books/edition/Renormalization_Group_T...
[3] The Equilibrium Theory of Inhomogeneous Polymers (International Series of Monographs on Physics), https://www.amazon.com/dp/0199673799/ref=cm_sw_r_apan_glt_i_... (numerical SCFT)
[4] https://en.m.wikipedia.org/wiki/Gaussian_network_model


> It does represent that many atoms.

It also represents even more electrons and even more quarks than that. I think it would be silly to characterize this system by the number of constituent quarks, but that's just me. To me, the important number is the number of degrees of freedom within the model. In a force-field model this scales linearly with the number of atoms. In the model presented in the paper, this depends on voxel resolution and number of molecular species. Sadly, this is omitted in the press release.
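
For a rough sense of scale (back-of-the-envelope only, using the ~40,000 voxels quoted elsewhere in this thread and assuming 300 species for the "hundreds of molecular species" in the article):

    # Back-of-the-envelope state-variable count (illustrative numbers only)
    atoms = 2_000_000_000
    force_field_dof = 3 * atoms           # ~6e9 coordinates in an all-atom model
    voxels, species = 40_000, 300         # figures assumed from this thread/article
    lattice_dof = voxels * species        # ~1.2e7 per-voxel molecule counts
    print(force_field_dof / lattice_dof)  # ~500x fewer state variables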

Thanks for your links :) I work in inorganic materials but I really should understand more about models for more complex systems.


Do you know how they deal with DNA? Does it fit in a single voxel? (Probably no). Are strands of DNA and RNA treated in chunks in a chain of voxels? Does the simulation perform transcription?


I am not an expert on this type of model but after reading their methods section a little more thoroughly, this is my sense:

I don't believe there is any case where a molecule in one voxel knows about molecules in other voxels. DNA and RNA are coarse-grained, where a 'molecule' might represent a specific sequence of interest rather than a full chain. Transcription and translation are modeled, essentially by saying that if the necessary ingredients are present in a voxel (DNA/RNA sequences, enzymes, raw materials, etc.) there is some chance of forming mRNA or a protein as a function of the molecular concentrations present in the voxel. DNA and RNA reactions are treated with somewhat different equations than the rest of the molecules, I think to handle the coarse graining.


This is all correct. These simulations actually can model molecular crowding by using a diffusive propensity proportional to the particles in the adjacent cells, but I don't think it was used here. I developed this methodology in grad school, but it didn't go anywhere.


Thanks for confirming my take. This is a little out of my comfort zone.

Do you see a future for this simulation method with increased computing power, or do you think its limitations might still restrict its applicability? Maybe this is naivete coming from an atomistic perspective, but it seems to me that the inability to model reactions that aren't explicitly predefined would be a significant challenge.


DNA does not fit in a voxel. Transcription is modeled by particles fixed in space which produce transcripts at a constant rate. The mRNA is treated the same as the other discrete particles in the simulation, though they are likely a bit bigger than the voxel size.


In theory, I don't think there's such a thing as simulation without simplifications. The world seems to be continuous but our computers are discrete. There's a small set of things we know how to solve exactly with math, but in general we have no way to deal with infinity. Any given variable you're calculating will be truncated at 32 or 64 bits when in reality it has an infinite number of digits, changing at continuous timesteps, interacting with every other atom in the universe.

In practice, none of this matters though and we can still get very useful results at the resolution we care about.


All simulations have to make the Born-Oppenheimer approximation: nuclei have to be treated as frozen, otherwise electrons don't have a reference point.

There will never be true knowledge of both a particle's location and momentum, a la the uncertainty principle; they will always have to be estimated.


But, for a system of two quantum particles which interact according to a central potential, you can express this using two non-interacting quantum particles, one of which corresponds to the center of mass of the two, and the other of which corresponds to the relative position, I think?

And, like, there is still uncertainty about the position of the "center of mass" pretend particle, as well as for the position of the "displacement" pretend particle.

(the operators describing these pretend particles can be constructed in terms of the operators describing the actual particles, and vice versa.)

I don't know for sure if this works for many electrons around a nucleus, but I think it is rather likely that it should work as well.

Main thing that seems unclear to me is what the mass of the pretend particles would be in the many electrons case. Oh, also, presumably the different pretend particles would be interacting in this case (though probably just the ones that don't correspond to the center of mass interacting with each-other, not interacting with the one that does represent the center of mass?)
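
For the two-body case at least, the standard textbook separation gives the "pretend" masses explicitly (sketched here; the many-electron generalization is the part I'm unsure about):

    H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(|\mathbf{r}_1 - \mathbf{r}_2|)
      = \frac{P^2}{2M} + \frac{p^2}{2\mu} + V(|\mathbf{r}|),
    \qquad M = m_1 + m_2, \quad \mu = \frac{m_1 m_2}{m_1 + m_2}

so the centre-of-mass particle carries the total mass and the relative-position particle carries the reduced mass.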

So, I'm not convinced of the "nuclei have to be treated as frozen, otherwise electrons don't have a reference point" claim.


You are right not to be convinced, because it is entirely incorrect.


With a quantum computer could one theoretically input the super position of possible locations and momenta and run the simulation based on that?


What? This is simply untrue.


A simulation can have both.


Then is it an accurate simulation without the uncertainty?


The pilot wave theory works perfectly fine with both exact position and momentum, but in other interpretations such particles simply don't exist.


I doubt it makes sense to assume the universe is continuous (I'm glad you said "seems"). In particular, space could be spatially quantized (say, around the Planck length) or any number of other details.

People have done simulations with quad precision (very slow) but very few terms in molecular dynamics would benefit from that. In fact, most variables in MD can be single precision, except for certain terms like the virial.
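
A quick toy illustration of why precision matters for accumulated quantities (nothing to do with any specific MD code): sequentially summing a small increment in float32 vs float64.

    import numpy as np

    # Sequentially accumulate 0.1 ten million times in single and double precision.
    # Exact answer is 1,000,000; the float32 running sum drifts noticeably once
    # the accumulator dwarfs the increment, while float64 stays essentially exact.
    x32 = np.full(10_000_000, 0.1, dtype=np.float32)
    x64 = np.full(10_000_000, 0.1, dtype=np.float64)
    print(np.cumsum(x32)[-1])   # visibly off from 1e6
    print(np.cumsum(x64)[-1])   # ~1e6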


It’s definitely fun to think about.

If the universe is discrete, how does one voxel communicate to the neighboring voxel what to update without passage through ‘stuff in between’ that doesn’t exist? Heh

It seems physics is going the opposite way with infinite universes and multiple dimensions to smooth out this information transfer problem and make the discrete go away.


As someone with keen interest in physics (and a bit of training) I find speculation about "discrete space" disquieting. It's the level of abstraction where intuition about space breaks down, and you have to be very careful. Remember that coordinate systems are short-hand for measurement. It's one thing to admit fundamental limits on measurement resolution, and quite another to say that space itself is quantized! Mostly I get around this by not thinking about it; most of these theories are only testable in atrocious and unattainable conditions, doing things like performing delicate QED experiments at the edge of a black hole.

I don't think your "voxel" intuition can be right because it's a small jump from that to (re)introducing an absolute reference frame.


> how does one voxel communicate to the neighboring voxel what to update without passage through ‘stuff in between’ that doesn’t exist? Heh

That kind of reminds me of the 'aether' that was once hypothesized as a medium of transmission for light and radio waves [0].

Also, voxels communicating sounds an awful lot like a higher-dimensional cellular automaton.

[0] - https://en.wikipedia.org/wiki/Aether_theories


Stephen Wolfram was right all along


All of our current theories are set in continuous spacetime. At the present, there's no reason to assume anything else.


I wouldn't say that "all our current theories" are set in continuous spacetime. For example, Quantum chromodynamics is set in SU(3), an 8-dimensional group of rotation-like matrices. Electric charge is discrete, spin is discrete, electron orbitals are discrete. In fact position and momentum would seem to be the outlier if they were not also discrete. I hardly call that "no reason".


SU(3) is a continuous group.


Yeah but it is very much not in space time.


But it is. SU(3) is the group for swapping colors around. It still has spacetime.


You can Cartesian product it with space time, yes. But that is possible for any system.


I'm having a hard time imagining quantum chromodynamics set in a single point. :)


It is based on SU(3), but, does it really make sense to say that it isn't still set in spacetime? Like, quarks still have position operators, yes?


True, but we do not actually know this for sure. There is a (small) possibility that we are simply looking at this at a scale where all we see is macro effects. It would require the quanta to be much smaller than the Planck distance though.


> There is a (small) possibility that we are simply looking at this at a scale where all we see is macro effects. It would require the quanta to be much smaller than the Planck distance though.

How much smaller?


Many orders of magnitude. How many? I do not know, I don't think anybody does.

But photons resulting from the same event but with different energies arrive at detectors an appreciable distance away to all intents and purposes simultaneously, something that would not happen if spacetime were discrete at a level close to the Planck length. So it would have to be quite a big difference for an effect not to show up as a difference in time-of-flight.


Is the idea that the two photons would traverse slightly different voxels due to the lower frequency wave being more spread out?

What accounts for the expectation that they do not arrive simultaneously in a voxel-based universe?


the issue is that there are no theories based on experimental evidence at very small scales. I agree that in most situations, it would be silly to violate this assumption, unless you were working on advanced physics experiments.


There's this concept that causation moves at the speed of light. When I first heard that, it sounded very much like a fixed refresh rate to me. Or maybe the "real world" is just another simulation


It does if you put it that way, but another way of putting it is that spacetime is hyperbolic (...well, Lorentzian), and all (lightspeed) interactions are zero-ranged in 4D.

As in, photons that leave the surface of the sun always strike those specific points in space-time which are at a zero spacetime interval from said surface. If you take the described geometry seriously, then "spacetime interval" is just the square of the physical distance between the events.
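
In symbols, with one common sign convention, the interval between two events is

    s^2 = c^2\,\Delta t^2 - (\Delta x^2 + \Delta y^2 + \Delta z^2)

and lightspeed (photon) separations are exactly the ones with s^2 = 0.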

(And any FTL path has a negative spacetime interval. If that's still the square of the distance, then I think we can confidently state that FTL is imaginary.)


>"full all interatomic physics"

It's certainly not that -- that's a hideously difficult algorithm with exponential complexity.

https://en.wikipedia.org/wiki/Full_configuration_interaction


It’s worse than exponential, it’s factorial :)


verlet list is the standard algo used to reduce the complexity in the number of interatomic calculations

https://en.wikipedia.org/wiki/Verlet_list
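
For anyone curious what that actually looks like, a toy sketch (brute-force O(N^2) build, no periodic boundaries; production codes build it with cell lists and only rebuild once atoms have moved more than about half the skin distance):

    import numpy as np

    def build_verlet_list(pos, r_cut, skin):
        """Toy Verlet (neighbour) list: for each atom, indices of all atoms
        within r_cut + skin. Forces are then only evaluated over these pairs
        until the list needs rebuilding."""
        r_list = r_cut + skin
        n = len(pos)
        neighbours = []
        for i in range(n):
            d = np.linalg.norm(pos - pos[i], axis=1)
            idx = np.where((d < r_list) & (np.arange(n) != i))[0]
            neighbours.append(idx.tolist())
        return neighbours

    pos = np.random.rand(200, 3) * 10.0   # 200 atoms in a 10x10x10 box
    nl = build_verlet_list(pos, r_cut=2.5, skin=0.3)
    print(sum(len(v) for v in nl) // 2, "pairs")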


that's old tech, these days it's usually some sort of PPPM (particle-particle particle-mesh) which parallelizes better.

But that's for classical simulations. Full configuration interaction is effectively computing the Schrodinger equation at unlimited precision; in principle, if you could scale it up, you could compute any molecular property desired, assuming QM is an accurate model for reality.


p3m, well pme, is exactly what we used for our calculations ;)

i never did any qm work beyond basic parameterization

i'm guessing you are/were also computational physics guy :)


I was a computational biologist for many years, which included a bunch of biophysics. I did extensive work with PME about 20 years ago, on supercomputers. It's a pretty neat technique (https://en.wikipedia.org/wiki/Ewald_summation), once you wrap your head around it!
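
Roughly, the core trick is splitting the slowly converging 1/r Coulomb sum using erf + erfc = 1:

    \frac{1}{r} = \underbrace{\frac{\operatorname{erfc}(\beta r)}{r}}_{\text{short-ranged, real space}}
                + \underbrace{\frac{\operatorname{erf}(\beta r)}{r}}_{\text{smooth, reciprocal space}}

The first term dies off quickly, so it can be summed directly with a cutoff; the second is smooth, so its Fourier series converges fast, and PME evaluates that part on a mesh with FFTs.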


yup, we used PME for non-bonded calculations in our simulations and to calculate things like electric potentials. I finished a biophysics phd back in 2020 and focused mainly on fluid flow.

Pretty cool, what're you up to now?


helping genentech scientists move to the cloud. I stopped being a scientist a long time ago and now I just sort of help scientists with the stuff I'm already good at.


funny enough I'm doing the exact same thing in public sector education. I'm always curious where people in our field end-up.

i saw some of your other comments about being at google. did you touch jax-md at all?

https://github.com/google/jax-md


I talked to the team, but unfortunately, jax-md at the time didn't do bond angles or torsions, so it wasn't good for biomolecular simulations.

My work mostly predated tensorflow and was much more about massive-scale embarrassingly parallel computing, and produced some interesting large-scale results from MD and protein folding.

https://www.nature.com/articles/nchem.1821 https://onlinelibrary.wiley.com/doi/full/10.1002/pro.2389


yup, i noticed that when i saw the first commits; in fact i thought it was someone's pet project. however, when i read that first odenet paper, it's clear keeping track of the gradients is extremely useful.

I'm very familiar with the first paper, the second author was on my committee.

so what does a cloud migration at a biotech company mean?

is it sort of a standard orchestrator + warehouse/lakehouse + distributed compute + cicd tools stack?


"is it sort of a standard orchestrator + warehouse/lakehouse + distributed compute + cicd tools stack?"

Ideally, yes, exactly. Except there are 100 orchestrators, 100 small local warehouses, and CI/CD is mostly jenkins.

Some things get forklifted over. I'm not trying to push people to adopt cloud native practices, just move them off physical onprem resources. Even that is a challenge because of data gravity.


Of course not, we can't even simulate how one protein folds.


Small proteins (one to two alpha helices) can now be routinely folded (that is, starting from a fully unfolded state, to getting stuck in the minimum around the final structure) using ab initio simulations that last several multiples of the folding time.

For larger proteins (a few alpha helices and beta sheets), the folding process can be studied if you start with structures near the native state.

None of this means to say that we can routinely take any protein and fold it from unfolded state using simulations and expect any sort of accuracy for the final structure.


When you say ab initio calculations, could you cite the level of theory? I think there could be some ambiguity given differences in scope.


When I say ab initio I mean "classical newtonian force field with approximate classical terms derived from QM", AKA something like https://ambermd.org/AmberModels.php

Other people use ab initio very differently (for example, since you said "level of theory" I think you mean basis set). I don't think something like QM levels of theory provide a great deal of value on top of classical (and at a significant computational cost), but I do like 6-31g* as a simple set.

Other people use ab initio very differently. For example, CASP, the protein structure prediction competition, uses ab initio very loosely to me: "some level of classical force field, not using any explicit constraints derived from homology or fragment similarity", which typically involves a really simplified or parameterized function (ROSETTA).

Personally I don't think atomistic simulations of cells really provide a lot of extra value for the detail. I would instead treat cell objects as centroids with mass and "agent properties" ("sticks to this other type of protein for ~1 microsecond"). A single ribosome is a single entity, even if in reality it's made up of 100 proteins and RNAs, and the cell membrane is modelled as a stretchy sheet enclosing an incompressible liquid.


Level of theory as it relates to ab-initio QM calculations usually indicates Hartree-Fock, MP2 and so on; then the basis set gets specified after.

I also agree that QM doesn't provide much for the cost at this scale, I just wish the term ab initio would be left to QM folks, as everything else is largely just the parameterization you mentioned.


The system I work with, AMBER, explains how individual classical terms are derived: https://ambermd.org/tutorials/advanced/tutorial1/section1.ht... which appears to be MP2/6-31g* (sorry, I never delved deeply into the QM parts). Once those terms are derived, along with various approximated charges (classical fields usually just treat any charge as point-centered on the nucleus, which isn't great for stuff like polarizable bonds), everything is purely classical springs and dihedrals and interatomic potentials based on distance.

I am more than happy to use "ab initio" purely for QM, but unfortunately the term is used widely in protein folding and structure prediction. I've talked extensively with David Baker and John Moulton to get them to stop, but they won't.


I would not describe AMBER, or anything using a newtonian force field, as ab initio.

In inorganic materials ab initio means you actually solve Schrodinger's equation (though obviously with aggressive simplifications e.g. Hartree-Fock).


Sure. But in the protein structure prediction field, "ab initio" is used to mean "structure was predicted with no homology or other similarity information" even though the force fields incorporate an enormous amount of protein structural knowledge.


I guess it's just weird to me to see ab initio used to describe a class of methods that in my field it explicitly precludes.

Having developed a few newtonian force fields, calling them "derived from QM" is very, very generous :P


What does https://foldingathome.org/ do then? That's been going on for nearly two decades.


Full list of achievements https://foldingathome.org/category/fah-achievements/?lng=en

This is the only real update of the year: https://foldingathome.org/2022/01/03/2021-in-review-and-happ...

SARS-CoV-2 has intricate mechanisms for initiating infection, immune evasion/suppression and replication that depend on the structure and dynamics of its constituent proteins. Many protein structures have been solved, but far less is known about their relevant conformational changes. To address this challenge, over a million citizen scientists banded together through the Folding@home distributed computing project to create the first exascale computer and simulate 0.1 seconds of the viral proteome. Our adaptive sampling simulations predict dramatic opening of the apo spike complex, far beyond that seen experimentally, explaining and predicting the existence of ‘cryptic’ epitopes. Different spike variants modulate the probabilities of open versus closed structures, balancing receptor binding and immune evasion. We also discover dramatic conformational changes across the proteome, which reveal over 50 ‘cryptic’ pockets that expand targeting options for the design of antivirals. All data and models are freely available online, providing a quantitative structural atlas.


Simulating very expensive to compute protein dynamics. These aren't guaranteed solutions, but it's still useful information.


So, even one protein cannot be simulated as in the real world?


Even one atom of a heavier element cannot be simulated in the real world, depending on what level of detail you want. Multi-atom simulations usually treat them as little non-quantum balls moving around in a force field that may have been approximated from quantum mechanics.


Not if you're going off of ab initio theory such as Hartree-Fock, MP2, CC, etc. We're talking amounts of matrix multiplication that wouldn't finish calculating this decade, even if you had parallel access to all top 500 supercomputers; once you get bigger than a single protein, it's beyond universal time scales with current implementations.


Every time some computer scientist interviews me and shows off their O(n) knowledge (it's always an O(n) solution to a naive O(n**2) problem!) I mention that in the Real World, engineers routinely do O(n**7) calculations (n == number of basis functions) on tiny systems (up to about 50 atoms, maybe 100 now?) and if they'd like to help, it would be nice to have better, faster approximations that are n**2 or better. Unfortunately, the process of going from computer scientist to expert in QM is entirely nontrivial, so most of them go do ML for ads instead.
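
To put n**7 scaling in perspective (trivial arithmetic, sizes are illustrative):

    # At O(n**7), doubling the number of basis functions multiplies the cost by
    print(2 ** 7)    # 128
    # and a 10x larger basis multiplies it by
    print(10 ** 7)   # 10,000,000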


How on Earth? I can't imagine the difference between the computational power of all top 500 supercomputers is THAT many orders of magnitude far off from the computational power of all the folding@home computational power donated by the general public.


supercomputers are specialized products with fast networking to enable real-time updates between nodes. The total node count is limited by the cost of the interconnect to get near-peak performance. You typically run one very large simulation for a long period of simulation time. folding@home doesn't have the luxury of fast networks, just lots of CPU/GPU. They run many smaller simulations for shorter times, then collect the results and do stats on them.

I looked at the various approaches and sided with folding@home. At one point I had 1 million fast CPU cores running gromacs.


It's not; folding@home doesn't do those calculations either, it uses a simplified model too.


One single iron atom's electrons - 26 of them - contain more degrees of freedom than atoms in the solar system.


A custom supercomputer dedicated to simulating folding proteins (two-state folders with nontrivial secondary and tertiary structure) from unfolded to correctly folded state using only classical force fields probably could work, and DE Shaw has invested a lot of money in that idea: https://en.wikipedia.org/wiki/Anton_(computer)

but, as I pointed out elsewhere, this would not be particularly helpful as it would use an enormous amount of resources to compute something we could probably approximate with a well-trained ML model.

It also wouldn't address questions like biochemistry, enzymatic reactions, and probably wouldn't be able to probe the energetics of interactions accurately enough to do drug discovery.


Nope. Don't dare ask how they treated water/solvation. Lolz.

Now, the question is, does it matter? When do you EVER need to know the exact atomistic, let alone electronic, trajectory of a single protein starting from a given position within a cell surrounded by waters in a given configuration?

It doesn't really matter. This is the beauty of noise and averages and -- dear to me-- statistical mechanics. At finite temperatures, AKA most everything we experience as living things, quantum details (or precise classical trajectories for that matter) aren't that important for the vast majority of questions we tend to have about a system.


I think it will be extremely useful to be able to simulate a cell at the real atomic level with all physics involved. That means you can introduce changes at the atomic level in DNA and see their effects. Introduce experimental medications into the cell and see their effects. You can actually study cell processes (many of which are not known yet) in simulation. And many, many more. It will be a revolution in medicine and in biology in general!


Any atomic level change in DNA is going to have maybe a 1-2 kcal energy difference. To properly calculate that (by running simulations) would require an economically impractical amount of computer time, and doesn't actually really change any of the hard problems in medicine.

Why am I saying this? Because I thought the same as you and it took me 20+ years to realize MD doesn't affect medicine at all.


then [why is my computer fan whirring?](https://foldingathome.org/)


Absolutely definitely not. It's not even possible to simulate a single protein-molecule interaction to an accuracy such that reaction rates are reproduced at room temperature. Small effects such as the quantum nature of H-motion prevent this from happening with present computational resources.

This research is something like a pixar movie, or one of those blender demos with a lot of balls :P


It seems that the researchers did use NVIDIA GPUs to perform the work, but it's not clear what sets the GPUs apart from others and why this research wouldn't be possible without NVIDIA's GPUs, as the article title and body implies.


Nvidia is pushing vertical integration hard. There are all sorts of libraries from Nvidia which build on top of CUDA, from simple cuBLAS to smart cities, autonomous driving, robotics and 5G.

They also provide acceleration of open source libraries like GROMACS, used for molecular dynamics simulation.


As you do, when you're the market leader. Wouldn't want people to be able to swap their GPUs to AMD. This is a real problem at the moment.

Of course AMD is making their own system that works on AMD and nVidia and Intel GPUs. It'll probably get locked down again if AMD gets market dominance.


Reminds me of how people used to think MS was absolutely evil while Apple was an altruistic company and Steve Jobs was some kind of saint. Turns out Apple just needed to gain some market dominance.


There are two main reasons to take advantage of the GPU in Lattice Microbes. It can simulate the stochastic chemical reaction and diffusion dynamics in parallel: one thread per voxel. For instance, an E. coli sized cell would have ~40000 voxels. It's not quite embarrassingly parallel, but close. Second, the simulation is totally memory bound, so we can take advantage of fast GPU memory. The decision to use CUDA over OpenCL was made in like 2009 or so. Things have changed a lot since then. I don't think anyone has the time or interest to port it over, unfortunately.
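
A hedged sketch of why that structure maps so cleanly onto a GPU, using CuPy (mentioned elsewhere in this thread) rather than raw CUDA, with made-up species and rates, and a deterministic update instead of the real stochastic one: every voxel's update depends only on its own state and its immediate neighbours, so each array expression below compiles to an elementwise kernel, effectively one logical thread per voxel.

    import cupy as cp

    n_vox = 40_000                                  # roughly the E. coli figure above
    A = cp.random.poisson(5.0, n_vox).astype(cp.float32)
    B = cp.random.poisson(2.0, n_vox).astype(cp.float32)
    k, D, dt = 0.01, 0.1, 0.1

    for _ in range(1000):
        A = A - k * A * B * dt                                     # local "reaction"
        A = A + D * dt * (cp.roll(A, 1) + cp.roll(A, -1) - 2 * A)  # diffusion stencil
    print(float(A.sum()))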


Most likely because the software they use uses CUDA.


I know that CUDA is faster than OpenCL for many tasks, but is there something that is not possible to achieve in OpenCL but possible in CUDA?


I’m subscribed to some CUDA email list with weekly updates.

One thing that strikes me is how it evolves with new features. Not just higher level libraries, but also more fundamental, low level stuff, such as virtual memory, standard memory models, c++ libraries, new compilers, communication with other GPUs, launching dependent kernels, etc.

At their core, OpenCL and CUDA both enable running parallel computing algorithms on a GPU, but CUDA strikes me as much more advanced in terms of peripheral features.

Every few years, I think about writing a CUDA program (it never actually happens), and investigate how to do things, and it's interesting how the old ways of doing things have been superseded by better ways.

None of this should be surprising. As I understand it, OpenCL has been put on life support by the industry in general for years now.


If you ever need to reap the benefits of CUDA & GPU computations without getting into the details, check out JAX by our corporate overlords™ (https://github.com/google/jax); it has NumPy-like syntax and is super fast to get started with.
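
For example (a trivial, made-up sketch, sizes and the toy potential chosen arbitrarily), code like the following is written as if it were NumPy but gets jit-compiled and runs on the GPU when one is available:

    import jax
    import jax.numpy as jnp

    @jax.jit
    def pairwise_energy(x):
        # Toy Lennard-Jones-style sum over all pairs; self-pairs masked out.
        d = jnp.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        d = jnp.where(jnp.eye(x.shape[0], dtype=bool), jnp.inf, d)
        return 0.5 * jnp.sum(4.0 * (d ** -12 - d ** -6))

    key = jax.random.PRNGKey(0)
    x = jax.random.uniform(key, (100, 3)) * 10.0
    print(pairwise_energy(x))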


Why would you suggest JAX? CuPy seems like an obvious choice here (simpler and a lot more mature). Jax is only needed if you want automatic differentiation.


TIL CuPy exists and is stable and mature. This is why i read this forum, every now and then there is serendipitous connection, new angle or discovery.


I just want pyXLA, a tool to directly construct XLA programs.


> is there something that is not possible to achieve in OpenCL but possible in CUDA

Developing fast... OpenCL is much harder to learn than CUDA. Take someone who did some programming classes, explain how CUDA works and they'll probably get somewhere. Do the same thing with OpenCL and they'll probably quit.


What makes it harder?


Possibly, but that's not really the point, the article is part marketing push from nvidia for their HPC department.


> but that's not really the point

That's what I thought as well, so the title on the website ("NVIDIA GPUs Enable Simulation of a Living Cell") is not really truthful then.


My understanding is that CUDA has a lot of optimized libraries for common tasks, think BLAS, that don't currently exist in OpenCL/Vulkan Compute.


I'm much more aware of a lot of things research is doing with Nvidia.

Due to CUDA, tools, SDKs, etc. that Nvidia is providing.

I'm not aware of anything similar at any other GPU company


Can someone comment on the legality of a 3rd party providing an unauthorized implemention of the CUDA API?

I would think that Oracle's loss of a similar lawsuit with Java would be related.


> Can someone comment on the legality of a 3rd party providing an unauthorized implemention of the CUDA API?

NVIDIA said that they would be fine with it in the past, and ROCm HIP is just a (bad) CUDA API clone.


Interesting question. Press article aside, GPGPU applications like scientific compute, ML etc. have all mostly gravitated to Nvidia / CUDA.

Not working in this space, I'm curious why this is the case. Is there something inherently better about CUDA? Or is it that Nvidia's performance is somehow better for these tasks? Or maybe something else?


The products are good, but NVidia also cleverly bootstrapped a whole ecosystem around them.

One of the other posts mentions 2014 as a turning point. At that time, GPGPU stuff was entering the (scientific) mainstream and NVidia was all over academia, convincing people to try it out in their research. They handed out demo accounts on boxes with beefy GPUs, and ran an extremely generous hardware grant proposal. There was tons of (free) training available: online CUDA MOOCs and in-person training sessions. The first-party tools were pretty decent. As a result, people built a lot of stuff using CUDA. Others, wanting to use those programs, basically had to buy NVidia. Lather, rinse, repeat.

This is in stark contrast to the other “accelerator” vendors. Around the same time, I looked into using Intel Xeon Phi and they were way less aggressive: “here are some benchmarks, here’s how it works, your institution has one or two somewhere if you want to contact them and try it out.” As for OpenCL…crickets. I don’t even remember any AMD events, and the very broad standard made it hard to figure out what would work/work well and you might end up needing to port it too!


AMD's GPU OpenCL wasn't just not marketed, it was also a bad product, even for relatively tame scientific purposes, even when AMD made loud, repeated statements to the contrary. Hopefully now that AMD has money they can do better.

I'm sure that NVidia's ecosystem building played a role (I remember events in 2009 and books before that), perhaps even a big role, but it wasn't the only factor. I paid a steep price in 2014 and 2016 for incorrectly assuming that it was.


Back in 2014 or so I made the unfortunate mistake of buying AMD cards with the thought that I'd just use OpenCL. I knew that some codes wouldn't run, but I had catalogued the ones I really cared about and thought I was up for the challenge. I was so, so wrong.

First of all, software that advertised OpenCL or AMD compatibility often worked poorly or not at all in that mode. Adobe creative suite just rendered solid black outputs whenever acceleration was enabled and forums revealed that it had been that way for years and nobody cared to fix it. Blender supported OpenCL for a while, but it was slower than CPU rendering and for a sticky reason (nvidia did the work to support big kernels with heavy branching and AMD didn't). Ironically, OpenCL mode had decent performance but only if you used it on an nvidia card.

The situation was even worse in scientific codes, where "OpenCL version" typically meant "a half-finished blob of code that was abandoned before ever becoming functional, let alone anywhere near feature-parity."

I quickly learned why this was the case: the OpenCL tooling and drivers weren't just a little behind their CUDA counterparts in terms of features, they were almost unusably bad. For instance, the OpenCL drivers didn't do memory (or other context object?) cleanup, so if your program was less than perfect you would be headed for a hard crash every few runs. Debugging never worked -- hard crashes all around. Basic examples didn't compile, documentation was scattered, and at the end of the day, it was also leagues behind CUDA in terms of features.

After months of putting up with this, I finally bit the bullet, sold my AMD card, bought an NVidia card, ate the spread, the shipping, the eBay fees, and the green tax itself. It hurt, but it meant I was able to start shipping code.

I'm a stubborn bastard so I didn't learn my lesson and repeated this process two years later on the next generation of cards. The second time, the lesson stuck.


NVIDIA provides tools that just mostly do not exist for other GPUs, making it easier to build on CUDA instead of something else.


Absolutely this. When cuda was first making headway it was the only thing even remotely close to a "developer environment" and made things significantly easier than any of the alternatives.

It might be different now, but at that time, many of the users were not computer scientists, they were scientists with computers. Having an easier to use programming model and decent debugging tools means publishing more results, more quickly.


From my cursory knowledge of the topic, there are competitors like ROCm, but CUDA was the first that had a useable solution here. Also last time I checked ROCm doesn't have broad support on consumer cards, which makes it harder for people to try it out at home.

But it seems ROCm is getting better and it has tensorflow and pytorch support, so there's reasons to be hopeful to see some competition here.


The fine grain parallelism of this simulation suits the GPU well. It would be possible on multicore CPUs, but possibly slower.


How much emergent behavior arises from the model? The only passage I see describing any of it is this one:

> The model showed that the cell dedicated most of its energy to transporting molecules across the cell membrane, which fits its profile as a parasitic cell.

Whether it mimics the behavior of real cells seems like the right test. We'll never be able to get it to parallel the outcome of a real system, thanks to chaos theory. But if it does lots of things that real cells do -- eating, immune system battles, reproduction -- we should be pretty happy.


How long did it take to simulate 20 minutes?

Looks like one NVIDIA Titan V took 10 hours to do it, and one NVIDIA Tesla Volta V100 GPU took 8 hours to do it?

Am I reading that right?

So the NVIDIA Tesla Volta V100 is 24 times slower than real life? Pretty cool.
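
Yes, that checks out, taking the 8-hour figure at face value:

    8 h = 480 min of wall-clock time for 20 min of simulated time
    480 / 20 = 24x slower than real time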


Generally speaking, this depends on the size (in terms of the number of constituents) of the piece of "real life" you are simulating.


This is really cool, but I don’t think it’s an atomistic simulation so I’m not sure where the title is coming from.

It seems to be some kind of a (truly impressive) kinetic model

The paper in Cell is open access https://doi.org/10.1016/j.cell.2021.12.025


A two-billion atom cell ... isn't that a bit small for a cell?


Yes. The cell was first created in the real world as part of research about the minimal set of genes required for life.[1] It is known as "JCVI-syn3.0" or "Mycoplasma laboratorium".[2]

Still amazing that it can now be fully simulated "in silico".

[1] https://www.science.org/doi/10.1126/science.aad6253

[2] https://en.wikipedia.org/wiki/Mycoplasma_laboratorium


It says

"In this new organism, the number of genes can only be pared down to 473, 149 of which have functions that are completely unknown."

But if we now can simulate this cell completely, shouldn't it be easy to figure out what those genes are doing? Just start the simulation with them knocked out.


Presumably if the number of genes cannot be pared down below 473, it dies very quickly if one of the 149 genes is knocked out. But "it doesn't work without it" is not a very satisfactory answer to "what does it do".


Yes, this is similar to opening a radio and saying "I don't know what this transistor does; let's take it out and see what the radio does".


See also "Can a biologist fix a radio" https://www.cell.com/cancer-cell/pdf/S1535-6108(02)00133-2.p... "Doug & Bill"(http://www2.biology.ualberta.ca/locke.hp/dougandbill.htm) "Could a neuroscientist understand a microprocessor"? https://journals.plos.org/ploscompbiol/article?id=10.1371/jo...

The funny thing is if you read the history of Feynman and others, most of them grew up opening up radios and learning how they worked by removing things and fixing them. It's a very common theme (sort of falls off post-transistor tho). I opened up radios as a kid, tried to figure out what parts did what, and eventually gave up.


That is a great read. Thanks :)


Before attempting to crack the copy-protection on a game, one might think something similar.


Valgrind that cell!


The article mentions that they use minimal cells. "Minimal cells are simpler than naturally occurring ones, making them easier to recreate digitally."


Permutation City, here we come.

I wonder if Greg Egan had the foresight to predict this for the story or if he invented that part for narrative purposes.


When I was like 12 or so, I had a thought that if we can calculate everything, we could be living in a full blown simulation.

To be honest, like 30y later, I still go back to that nagging thought _a lot_.


The thought that sticks in my mind is mathematical realism; if we can prove the mathematical existence of the outcome of a simulation (nothing harder than inductively showing that the state of a simulation is well-defined at state S for the first and all successive S) then what's the difference between things in the simulation actually existing v.s. possibly existing? All of the relationships that matter between states of the simulation are already proven to exist if we looked at (calculated) them, so what necessary property can we imagine our Universe having that the possible simulation does not?


> so what necessary property can we imagine our Universe having that the possible simulation does not?

It lacks the magical spark, the qualia, the spirit, the transcendent. Or what people like to imagine makes our own reality special. Our own reality cannot be understood because it's such a hard problem, and it "feels like something" (maybe like a bat?), while a simulation is just math on CPUs. Consciousness is a hard problem because it transcends physical sciences, it's so great that it can exist even outside the realm of verification. /s

Hope you forgive the rant, it's just amazing how much philosophy can come from the desire to fly above the mechanics of life. But what they missed is that the reality of what happens inside of us is even more amazing than their imaginary hard problem and special subjective experience. They should look at the whole system, the whole game, not just the neural correlates. What does it take to exist like us?


A simulated hurricane doesn't kill anyone.

But it may be possible that there's no such thing as "simulating" intelligence. If you do certain calculations, that is "intelligent." Same for consciousness, etc.


A simulated hurricane would kill simulated people.


Think of simulated children! Oh the simulated pain..


"We live inside a dream."



This idea has been formalized: https://www.simulation-argument.com/


This idea has also existed for at least 200 years

https://en.m.wikipedia.org/wiki/Laplace%27s_demon


I read that as a teenager, thought it sounded nice, went to grad school and did molecular dynamics simulations (like folding at home) for a decade, then went to google and built the world's largest simulation system (basically, the largest group of nodes running folding at home). Eventually we shut the system down because it was an inefficient way to predict protein structure and sample folding processes (although I got 3-4 excellent papers from it).

The idea is great, it was a wonderful narrative to run my life for a while, but eventually, the more I learned, the more impractical full atomistic simulations seemed for solving any problem. It seems more likely we can train far more efficient networks that encapsulate all the salient rules of folding in a much smaller space, and use far less CPU time to produce useful results.


Yeah, I think the idea of Laplace's Demon is mostly just useful to make a philosophical argument about whether or not the universe is deterministic, and its implications for free will.


I dunno, I wonder what Laplace would have made of the argument over the meaning of wavefunction collapse. It took me a very long time to come to terms with the idea of a non-deterministic universe.


That's peculiar. Most people probably struggle more with the idea of a deterministic universe, as it'd leave no room for free will, which would make everything kind of meaningless.

I'm also more in the camp of "quantum effects making the universe non-deterministic." It's a nicer way to live.


I've evolved over the years from "determinism implies no free will" to roughly being a compatibilist (https://en.wikipedia.org/wiki/Compatibilism, see also Daniel Dennett). I don't particularly spend much time thinking that (for example) a nondeterministic universe is required for free will. I do think from an objective sense the universe is "meaningless", but that as humans with agency we can make our own meaning.

However, most importantly, we simply have no experimental data around any of this for me to decide. Instead I enjoy my subjective life with apparent free will, regardless of how the machinery of the actual implementation works.


It’s interesting that many things are deterministic to human-relevant time/length scales. If the small stuff is non-deterministic, it’s interesting that large ensembles of them are quite deterministic.

It’s maddening :)


https://en.wikipedia.org/wiki/Evil_demon

Going further back to the 1600's, Descartes' idea of an evil demon deceiving one's mind with a perfect, fake reality made me think often of simulations in my undergrad philosophy classes


Looking at it from that view, we're just as likely to be a simulation as we are to have been created by God. I mean I'm a theist, but I don't see many huge differences except the cultural aspect where the theism/atheism debate is something most people have an emotional connection to.


A God, not being out for her own amusement, will likely create only one universe.

A player with a simulator will create dozens.


>A God, not being out for her own amusement, will likely create only one universe.

Why would that be? I see no reason why God might not create parallel universes


Electrical bill and GPU shortages in God’s reality could be a reason.


It's a bit naive... But the best argument for me that we are living in a simulation is that we went from Pong to pretty good VR (good enough that if you have a beer or two before using it you can forget it's VR for some period of time) in 50 years. In another 50 years it seems fair to assume that we will be able to create VR that is fully immersive and impossible to distinguish from real life.

Even with no other arguments about the benefits of WHY one would want to live in a fully simulated world... it seems probable to me that we are, based simply on the idea that it could be possible.


> In another 50 years it seems fair to assume that we will be able to create VR that fully immersive and impossible to distinguish from real life.

Technology growth is always non-linear. It's also fair to assume we could stagnate for 50 years.


we don't even need to be able to calculate everything, we just need to fool you! The Truman Show meets the Matrix.


If you want to solve that nagging thought, pick up Griffiths' intro to quantum mechanics textbook. It goes through the philosophical implications of QM alongside learning the physics. The world as we know it is non-deterministic thanks to wave functions and their random collapsing!


I went through the same phase at 12. I am nearing 18 now, and I am very thankful for nondeterminism.


Specifically, I'm referring to Autoverse, the artificial simulation of a single bacterium down to the atomic level.

It was such a fascinating idea that I found myself more than once trying to mimic the atomic part at a much smaller scale over the years.


> Permutation City, here we come.

You might enjoy the show Devs:

https://en.m.wikipedia.org/wiki/Devs

Fair non-spoilery warning, there is quite a bit of creepy existential angst.


> there is quite a bit of creepy existential angst

seems to be a trend across all genres nowadays


Sure. Though most of it isn't literally existential.


Nvidia and Facebook are working on building the world’s largest super computer. The stated reason I heard was to power the metaverse but that strikes me as a red herring.

Things closer to this are probably what they want to do.

A slightly more pessimistic take would be that they want to manage and crunch all that personal data they have more quickly, or for others that have lots of data to crunch that Facebook might benefit from managing/having (governments).


I wish they detailed how independent the simulated processes really are, and what sorts of dynamics are lost by not considering the atomic level.


[deleted]


I think it has to reproduce to qualify.


Now we just need to scale this by a mere 37,200,000,000,000x and we'll have simulated the entire human body!


Moore's law suggests it will be possible in 90 years if the historical trend holds true.
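
Rough arithmetic behind that estimate (assuming a clean doubling every two years, which is itself optimistic):

    import math

    factor = 37_200_000_000_000        # ~37.2 trillion, from the comment above
    doublings = math.log2(factor)      # ~45.1 doublings needed
    print(doublings * 2)               # ~90 years at one doubling per two years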



Luckily, there are other means of performing computation than just silicon transistors.


Price and energy use can still go down even if transistor density stays the same


Apart from this article, do we have any visual representation [a CGI, maybe] of the full activity inside a cell?


I have always been amazed by these 2D representations: https://www.digizyme.com/cst_landscapes.html


one note- as lovely as those are, they don't make the point that everything in the cell (all the proteins, etc) is constantly grinding against each other (there's almost no room for water).


That's fascinating, thanks for sharing!


All of this incomprehensible complexity just so our genes can compete against other genes in their mindless drive for survival. It's kind of sad.


Things really got out of hand after that first self-replicating gizmo, didn't they?


The boilerplate to make a double click with your mouse do something relevant is also completely mind blowing. #complexity


I always appreciated the work of David Goodsell at UCSD: https://ccsb.scripps.edu/goodsell/

He paints cell internals.

I also like the Biovisions videos from Harvard:

https://www.youtube.com/watch?v=VdmbpAo9JR4


YouTube suggested this one to me, which was really nice too: https://www.youtube.com/watch?v=DR80Huxp4y8


We can't visualize what we do not know. The full activity inside a cell is not known, and we are pretty far from knowing it.


What do you mean? Didn’t they just simulate the full activity of a cell on a molecular level?


It seems they did not. For more details please read this thread: https://news.ycombinator.com/item?id=30114645


The thread you linked to seems to confirm this is a molecular level simulation.


VMD is standard software for rendering biological system simulations.


I don't see in the article why this is useful or what it is used for.


For instance, the cartoon version of DNA that is presented to even lower-year biology undergraduates is of linearized strands. But of course, it's really all spooled and tangled and crunched up into the nucleus of cells. Note that the cell they simulated is of a prokaryote (no nucleus, much simpler cellular processes than e.g. a mammal cell). About 1-2% of our genome codes for proteins, although the proportion is much larger in single-celled organisms (less redundancy in the genome, no splicing, etc.) So when you hear that e.g. genes turn on or off, this is not a switch. It's literally some sections of DNA being unwound, and large complexes of mutually interacting molecules probabilistically glomming on and off the DNA. The actual 3D layout of this DNA "ramen" matters, e.g. to bring promoter regions of genes close to the actual genes they control.

So basically, we have a schematic-level understanding of cellular processes, but to see the actual 3D interactions in realtime would be extremely illuminating.


I should say that this work is not simulating things at this detail. Instead, it's more like a biophysical model of a bunch of chemical reactions with rate information. It probably boils down to a big system of coupled differential equations, at different timescales. So, it's a statistical level of detail, but still very informative.
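
For a sense of what "a big system of coupled differential equations" means at toy scale, here's a two-species sketch (made-up rate constants); a whole-cell kinetic model is the same idea with thousands of species and reactions, coupled across very different timescales:

    from scipy.integrate import solve_ivp

    # Toy gene-expression kinetics: mRNA is transcribed and degraded,
    # protein is translated from mRNA and degraded.
    k_tx, k_deg_m, k_tl, k_deg_p = 0.5, 0.1, 2.0, 0.05   # made-up rates

    def rhs(t, y):
        m, p = y
        return [k_tx - k_deg_m * m,           # d[mRNA]/dt
                k_tl * m - k_deg_p * p]       # d[protein]/dt

    sol = solve_ivp(rhs, (0.0, 200.0), [0.0, 0.0])
    print(sol.y[:, -1])   # approaches m* = k_tx/k_deg_m = 5, p* = k_tl*m*/k_deg_p = 200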


It's generally not possible to see where everything is in an actual cell, in realtime, due to the sizes of the components. So most of molecular biology relies on very clever lab techniques to indirectly infer what cells are making and doing.

Cells are like little cities in terms of the complexity of their biochemistry. We want to ask questions like "How does this cell respond to this chemical/drug/change in environment."

Imagine trying to understand in detail the gun crime epidemic in a city, if you can only see objects larger than 100 m on a side. You wouldn't see people, cars, or many buildings.

We want to be able to understand, explain, predict, and control cellular process, but so far we have to be quite indirect. Understanding these things at a mechanistic level, in realtime would revolutionize our ability to understand, repair, and build biological systems.


i don't even know where to start. you could simulate one cell, then two cells, then four... suddenly you could have an organism... hell, you could see organisms that could have lived on earth.

maybe it'll one day help with cancer research.


Oh I see, I didn't realize it extrapolated out that way, this was a good explanation that makes a lot of sense to me, thanks. Also.. that's f'ing cool.


It's simple really. First you simulate a single cell, then a sperm and an egg cell. Then you simulate a virtual world of virtual captive humans to do our work for us without payment.


Someone is simulating all of my cells - and yours too ;)


If this is true, there must be some species at some level of simulation who's not being simulated.

I'm not sure if you're being real or not, but if you are, do you think the species who made our simulation are also being simulated?


> If this is true, there must be some species at some level of simulation who's not being simulated.

You can't fool me, it's turtles all the way down!

With that out of the way, I'll observe there is no reason that such a base layer of reality need bear any particular resemblance to ours except in the tautological sense that it would need to be Turing complete in order to be capable of hosting a simulation.


I agree that it would probably not resemble our universe. I would think it has to be a universe that's capable of simulating our universe without consuming all of the host universe's resources as it would need at least some sort of species that would want to simulate our universe. At least initially.

I'm not sure what you (and other people) really mean when you say our universe is simulated.

- Do you mean that the entire universe is simulated down to the Planck level?
- Do you think there's some sort of optimization going on?
- Do you think it's done by a species that evolved to become curious to see what would happen if you simulate the universe (like us)?

I can say that our universe is simulated too, but I have no idea if this simulation was made by someone or if it "just is".

But if you believe the universe is a simulation in some host universe, then it must be possible to have a universe that "just is" / or is Turing complete as you put it.


I mean that such a universe could be so different from ours that the idea of 'species' may not even be sensible.

> Do you mean that the entire universe is simulated down to the planck level?

Unspecified. Perhaps gross approximations are used unless an attempt is made to observe (internally or externally) more detail.


> I mean that such a universe could be so different from ours that the idea of 'species' may not even be sensible.

Alright. I've heard people say they think our universe is being simulated because that's what we would do. For those who think that, the host universe is at least somewhat similar to us.

> Unspecified. Perhaps gross approximations are used unless an attempt is made to observe (internally or externally) more detail.

But if gross approximations are true, that reveals information about the host doesn't it? If they resort to approximations because they don't have enough resources, that tells us they must really want to do this for some reason. Did they want to create our simulation for fun? Out of desperation? Are we made for research purposes? All those questions point to something human-like in my opinion, and thus "species".


Have to take this opportunity to re-share one of my favorite dreams. From back in the early 90s when I watched Star Trek regularly. I was on the bridge. A crew member. The captain shouted some order to the crew. I shouted "belay that order". The captain said "I'm a level 5 - I know what I'm doing". I replied "I am a level 12, so you should listen to me".


Could be an Ouroboros, the entirety of existence being created from nothing in an enormous circular dependency. It sounds farcical but when you think about why the universe exists in the first place, it seems as good a reason as any.


> when you think about why the universe exists in the first place, it seems as good a reason as any.

I think this sums up how I think. If any reason is as good as any then it's equally likely that our universe is not simulated and not an Ouroboros.

It can be a lot of fun to speculate and think about though.


Just kidding - mostly. But there are legit scientists looking for simulation artifacts.


https://www.scientificamerican.com/article/confirmed-we-live...

"This helps us arrive at an interesting observation about the nature of space in our universe. If we are in a simulation, as it appears, then space is an abstract property written in code. It is not real. It is analogous to the numbers seven million and one in our example, just different abstract representations on the same size memory block. Up, down, forward, backward, 10 miles, a million miles, these are just symbols. The speed of anything moving through space (and therefore changing space or performing an operation on space) represents the extent of the causal impact of any operation on the variable “space.” This causal impact cannot extend beyond about 300,000 km given the universe computer performs one operation per second. "


The question is (again) how soon now until I can boot the "ROM" file of my DNA in an emulator?


This won't happen. Computationally inconceivable with all that we know at the moment.


the idea is great but the specifics are oversimplified.


Devs, the prequel.


This seems like the company that will dominate the 2020s. The time is ripe to join NVIDIA


Nvidia GPUs enable nothing, because you can't buy any at reasonable prices.


This argument makes no sense. Consumer GPU pricing (which I'm assuming is what you're referring to) has very little to do with the pro market (industry, research etc.)

The researchers are using things like the DGX or RTX A-series. These, while quite expensive, are not that unreasonable when it comes to pricing.


An individual could afford computing power for such research activities (not exactly like this one, but e.g. for personal ML experiments) in 2018-2019 for an adequate price. You were able to buy 2 new RTX2080s for the today price of a used single unit. If you want to tinker and need GPU power today, your best option is to rent special datacenter-approved(tm) GPUs for the really expensive $/h. And you don't own anything afterwards (except if you bought GPU before the end of 2020). Does this make no sense? Is this how technological progress should work?


2080s? With only 8GB of VRAM that's not even ECC backed?

Even for ML model training back then, 8GB was on the small side (a lot of the research repos even had special parameter sets to allow running on consumer level VRAM GPUs). Also, for something like long running bio simulations, you'd probably want to be sure that your memory bits aren't being flipped by other sources -- the extra upfront cost is well worth preventing potentially wrong research results...

Nvidia consumer products have been a better value proposition in the past for sure. But they've always done market segmentation. It's not merely a matter of "datacenter-approved(tm) GPU" (though they do also do driver-based segmentation).


If you don't care if some rando whose machine you rented can see what you are doing, vast.ai can be a good resource for GPU compute too.


"That place is so crowded that no one goes there anymore."


Leela: Did you drive much in the 20th century, Fry?

Fry: Nobody in New York drove, there was too much traffic.



