Interesting paper, thanks for sharing! Whenever I see physics-based inverse design using ML surrogates, I always ask, “why not optimize the problem directly?” (e.g., computing the gradient using an adjoint-variable method). The paper implies that the forward simulation process isn’t differentiable, but is this true? Thanks!
First off, the ML surrogate is many orders of magnitude faster than direct kMC simulation. Each global optimization involved up to 500 sequential predictions of emission intensity with different nanoparticle structures (and gradients - I'll come back to those). For the largest particles that we optimized, running kMC on only the final optimized structure (for validation) took three months! So running 500 sequential kMC simulations would take approximately 125 years - not ideal. In contrast, all 81 global optimizations took two days with our trained hetero-GNN.
Back to the gradients - running the kMC simulations involves generating four different randomly doped 3D nanoparticle structures according to the shell thicknesses and dopant concentrations of interest, then running four kMC trajectories with different random seeds for each structure. The final predicted emission (based on the number of emitted photons within the energy range of interest) is averaged over all sixteen trajectories. kMC is fundamentally stochastic and not differentiable. Even if you could make it differentiable with the adjoint-variable method (which I am admittedly not very familiar with), or perhaps by writing the kMC code in a language with native automatic differentiation support, once you move from a description of layer thicknesses and dopant concentrations to an ensemble of random 3D structures/trajectories, I still don't think the gradient of emission with respect to those layer thicknesses and dopant concentrations would be accessible.
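To make the averaging concrete, here's a minimal sketch of that pipeline. The function names (`build_doped_nanoparticle`, `run_kmc_trajectory`) are hypothetical stand-ins for the real structure generator and kMC engine, stubbed out with random draws just so the sketch runs end to end:

```julia
using Random, Statistics

# Hypothetical placeholders - NOT the actual code. They just return
# random values so the averaging loop below is runnable.
build_doped_nanoparticle(thicknesses, concs; rng) = (thicknesses, concs, randn(rng))
run_kmc_trajectory(structure; rng) = rand(rng, 50:150)  # photon count in the band of interest

# Emission averaged over 4 random doping realizations x 4 random seeds = 16 trajectories.
function predicted_emission(thicknesses, concs; nstructures=4, nseeds=4)
    counts = Float64[]
    for s in 1:nstructures
        structure = build_doped_nanoparticle(thicknesses, concs;
                                             rng=MersenneTwister(s))
        for t in 1:nseeds
            push!(counts, run_kmc_trajectory(structure;
                                             rng=MersenneTwister(1000 * s + t)))
        end
    end
    return mean(counts)  # the discrete RNG draws are what break differentiability
end

predicted_emission([2.0, 3.5, 1.0], [0.02, 0.2, 0.0])
```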
> So running 500 sequential kMC simulations would take approximately 125 years - not ideal.
Ah, I see. Yeah, it’s hard to get around this…
> is fundamentally stochastic, and not differentiable
This is actually a common “inverse design pattern” for a variety of applications, and luckily there are tricks to efficiently compute gradients here. In my domain (nanophotonics) we’re often simulating incoherent sources, which are similarly stochastic. But there are ways to reformulate the problem to drastically reduce the number of forward and adjoint solves you need during the design process [1].
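Not necessarily what [1] does, but one common trick in this family: for incoherent sources, the objective is a trace over the source correlations, and a Hutchinson-style stochastic trace estimator lets a handful of random probe vectors stand in for one solve per source point. A toy sketch, where the dense matrix is just a stand-in for the actual Maxwell solve:

```julia
using LinearAlgebra, Random

# Hutchinson estimator: Tr(A) ≈ mean(zᵀ A z) over random ±1 probes.
# Each probe costs one "solve" (here just a matvec), so a few dozen
# probes can replace one solve per incoherent source point.
function hutchinson_trace(apply_A, n; nprobes=64, rng=Random.default_rng())
    acc = 0.0
    for _ in 1:nprobes
        z = rand(rng, (-1.0, 1.0), n)  # Rademacher probe vector
        acc += dot(z, apply_A(z))
    end
    return acc / nprobes
end

M = randn(200, 200); A = M' * M      # toy SPD system standing in for the physics
@show hutchinson_trace(x -> A * x, 200), tr(A)
```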
That being said, given the cost of the forward problem (3 months per iteration!), this doesn’t help you much…
> They require significantly different fabrication processes, and we don't know how to fab them into the same chip as electrical ones.
There are actually a few commercial fabs that will monolithically integrate the photonics, analog electronics, and digital electronics, all in the same CMOS process. See for example GF’s process:
Integrating good optical sources in silicon remains a challenge, but companies like Intel have mastered hybrid bonding and other packaging techniques. TSMC too has a strong silicon photonics effort.
I wonder why ORNL used AMDGPU.jl directly rather than something like KernelAbstractions.jl, which doesn’t require you to overspecialize to a particular architecture. I realize Frontier is all AMD. But the DOE labs have flipped back and forth between HPC platforms a few times. (Which is partly why Sandia invested so heavily in developing Kokkos).
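For what it’s worth, a KernelAbstractions.jl kernel is backend-agnostic: the same `@kernel` runs on the CPU or on AMD/NVIDIA GPUs depending on where the arrays live. A minimal sketch (not ORNL’s code):

```julia
using KernelAbstractions

# One kernel definition; the backend is picked from the array type at launch time.
@kernel function saxpy_kernel!(y, a, @Const(x))
    i = @index(Global)
    y[i] = a * x[i] + y[i]
end

x = ones(Float32, 1024)
y = zeros(Float32, 1024)
backend = get_backend(y)  # CPU here; ROCBackend() if y were a ROCArray
saxpy_kernel!(backend)(y, 2.0f0, x; ndrange=length(y))
KernelAbstractions.synchronize(backend)
```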
Given the amount of code involved, perhaps just out of fear that the generated code might not be as good? It’s a fairly quick job to write something like that, but the scaling tests take a bit more time.
I demonstrated 3D printing of metals using dynamic stencil deposition in a thermal evaporation chamber during my PhD about 18 years ago. Separate comment has details.