Traynor was quoted in a Network World article last year saying they aim for three and a half nines (99.95%). But you need to read the incident reports more carefully -- figuring out actual "uptime" is quite hard. Consider the longest-lasting incident:
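For context, a quick sketch of what a 99.95% target actually allows. The availability figure comes from the quote above; the period lengths are just calendar arithmetic:

```python
# Downtime budget implied by an availability target.
# 0.9995 is the "three and a half nines" target quoted above.

def downtime_budget_minutes(availability: float, period_hours: float) -> float:
    """Minutes of allowed downtime in a period at the given availability."""
    return (1.0 - availability) * period_hours * 60.0

per_year = downtime_budget_minutes(0.9995, 365 * 24)  # ~262.8 min (~4.4 h/year)
per_month = downtime_budget_minutes(0.9995, 30 * 24)  # ~21.6 min per 30-day month
```

So a single full outage of the length quoted below would blow the budget for more than a year -- which is why how you count partial outages matters so much.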
"On Tuesday 23 February 2016, for a duration of
10 hours and 6 minutes, 7.8% of Google Compute Engine
projects had reduced quotas. ... Any resources that
were already created were unaffected by this issue."
I'm not sure off the top of my head how I'd compute overall availability numbers from that one. One could try to estimate and sum the effect on each affected customer, but we can't from the information provided. Still, it's certainly less overall downtime than just counting it as a 10-hour failure.
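One hedged way to fold a partial outage into a single number is to weight the incident duration by the fraction of projects affected. The 10 h 6 min duration and 7.8% figure come from the incident report quoted above; the weighting scheme itself is just one assumption, since it treats every affected project as fully down:

```python
# Project-weighted downtime: duration scaled by the fraction affected.
# Assumes (pessimistically) that affected projects were fully down.

def weighted_downtime_minutes(duration_min: float, fraction_affected: float) -> float:
    return duration_min * fraction_affected

# 10 h 6 min = 606 min, 7.8% of projects affected -> ~47.3 min
incident = weighted_downtime_minutes(10 * 60 + 6, 0.078)
```

By that (generous-to-Google) accounting, the incident costs roughly 47 fleet-wide minutes rather than 606 -- though for the 7.8% who were affected, it was still a 10-hour outage.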
Agreed, it is difficult to tell. But if the bug prevents you from processing (because you can't save the existing results), then it's essentially downtime for new processing. There are also per-region connectivity issues and DNS issues. It is difficult to get exact downtime figures given partial failures.
That said, this is the second major asia-east1 downtime in 90 days: