At my non-Google job we talk about "what happens if a meteor hits a DC".
We agree that it is so rare that, as long as there are buttons we can push to recover within a reasonable timeframe, it is an acceptable risk; we don't need a fully automatic way to recover from it.
However, your SRE teams need a way to recover without intervention, which is why there is talk of backups.
BTW, even using different cloud providers isn't necessarily enough to avoid a DC outage. No amount of redundancy can protect you from it, short of a ton of services that can intentionally cut off access to the DC, and those bring the risk of that happening accidentally, which is its own risk.
But that’s incredibly stupid. A meteor hitting a DC is an intentionally dumb scenario that eclipses the much more likely risks from the thousands of other things that can wipe out a DC.
I’ve been involved in such discussions, and on the surface they seem reasonable, but they turn into an easy reason to write off a DC being wiped out.
In comparison, there are far more likely reasons for a DC to be wiped out effectively permanently. Data centers at the base of WTC 1 and 2 are good examples. The myriad of targeted attacks on the power grid are also prime examples of attacks that would cripple a data center for weeks if it were in the crosshairs.
The Cascadia subduction zone has a much higher probability of wiping out all of them in the PNW simultaneously than a meteor hitting a single one.
Looking at it from a different perspective, there are other teams at Google who understand how sensitive their data centers are, and they act appropriately. The list of data center locations is not public. A motivated group with long-range rifles purchased from Walmart could wipe out a Google data center.