> The first step is potentially changing the way data is assimilated into AI-based models. At present, they almost universally use a set of initial conditions produced by a physics model. That is, a model like the ECMWF spends an enormous amount of computing power to collect data from buoys, surface stations, weather balloons, airplanes, ships, satellites, and many other sources and then synthesizes a set of initial conditions for grid points across the planet. All models then take this as the beginning "state" of the planet's weather and forecast from that.
So this is essentially learning the time-stepping part of the physical model, not deriving predictions from raw data. While still interesting and probably still complex, this is far less impressive than the title led me to believe.
You think the difficult part is merging observations with the last forecast? I guess it's a very underdetermined problem, but isn't the loss function (compare the forecast grid with later observations) the same whether you're doing grid_t0 -> grid_t1 or (observations, grid_t0) -> grid'_t0 -> grid_t1? I don't know enough about ML to know how much complexity the extra step adds, but it doesn't seem like a massive difference.
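To make the comparison concrete, here is a toy sketch of the two pipelines scored with the same loss. Everything in it is an assumption for illustration: `step`, `assimilate`, and `loss` are hypothetical stand-ins (a shift and a simple nudge, not a real forecast model or assimilation scheme), and the numbers are made up.

```python
import numpy as np

def step(grid):
    # Hypothetical stand-in for the learned time-stepper: grid_t0 -> grid_t1.
    return np.roll(grid, 1)

def assimilate(obs_values, obs_idx, grid, gain=0.5):
    # Hypothetical stand-in for assimilation: nudge observed cells
    # part of the way toward the observations, giving grid'_t0.
    out = grid.copy()
    out[obs_idx] += gain * (obs_values - grid[obs_idx])
    return out

def loss(forecast, later_obs, later_idx):
    # Same loss in both setups: mean squared error between the
    # forecast grid and later observations at the observed cells.
    return float(np.mean((forecast[later_idx] - later_obs) ** 2))

grid_t0 = np.array([1.0, 2.0, 3.0, 4.0])
obs_idx, obs_t0 = np.array([0, 2]), np.array([2.0, 5.0])
later_idx, later_obs = np.array([1, 3]), np.array([1.5, 4.0])

# Pipeline A: grid_t0 -> grid_t1
loss_a = loss(step(grid_t0), later_obs, later_idx)

# Pipeline B: (observations, grid_t0) -> grid'_t0 -> grid_t1
loss_b = loss(step(assimilate(obs_t0, obs_idx, grid_t0)), later_obs, later_idx)

print(loss_a, loss_b)
```

The `loss` function is literally the same object in both cases; only the pipeline in front of it changes. What the toy hides is how hard it is to write a realistic `assimilate`.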
Observation assimilation is a huge field in and of itself. Observables have biases that have to be accounted for in assimilation. They also have finite resolution, so observation operators need to be taken into account.
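A minimal sketch of what those two points mean in practice, under heavy simplifying assumptions: the instrument is assumed to see the average over a footprint of neighboring grid cells (the observation operator), and its bias is assumed to be a known constant. The function names and values are hypothetical, not taken from any operational system.

```python
import numpy as np

def observation_operator(grid, start, width):
    # Finite resolution: the instrument does not see one grid cell,
    # it sees the mean over a footprint of `width` cells.
    return float(np.mean(grid[start:start + width]))

def innovation(raw_obs, bias, grid, start, width):
    # Bias-corrected observation minus the model's equivalent of that
    # observation. This difference is what assimilation works with.
    return (raw_obs - bias) - observation_operator(grid, start, width)

grid = np.array([10.0, 12.0, 14.0, 16.0])
d = innovation(raw_obs=14.5, bias=0.5, grid=grid, start=1, width=2)
print(d)
```

Real instruments are much messier: biases drift and have to be estimated, and observation operators for things like satellite radiances are far more involved than an average.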
I just assume every AI headline is one damn company or another trying to juice their stock price by finding someplace in their product line to shove an LLM. I'm right more often than not.