The worst thing for any service provider is to delay the initial confirmation th...

moe · on April 22, 2011

There really is no negative side for someone as large as Amazon to immediately put up a quick notice that "we are receiving complaints about x-y-z"

Do you have any idea how many such complaints Amazon receives on a normal day? Per hour?

Why make every single AWS customer panic for an hour?

Diagnosing problems in a big system is not that easy.

A turnaround time of an hour is not too bad for a behemoth the size of Amazon, especially when you consider that this was a worst-case scenario.

cft · on April 22, 2011

Even with a large service, you can immediately identify a fluctuation in the number of complaints (in addition to signals from monitoring tools). Speaking from experience. In fact, the larger the service is, the easier it is to statistically identify an uptick in the number of complaints per smaller time interval.
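To make the statistical point concrete, here is a minimal sketch. It assumes complaints arrive independently at a steady average rate, so the count per interval is roughly Poisson and its standard deviation is the square root of the mean. The function names, the threshold of 3, and the complaint numbers are all hypothetical, chosen only for illustration; nothing here reflects how Amazon actually monitors anything.

```python
import math


def uptick_zscore(baseline_rate: float, observed: int) -> float:
    """Standard deviations by which `observed` exceeds the baseline.

    Assumes a Poisson baseline, where std = sqrt(mean).
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (observed - baseline_rate) / math.sqrt(baseline_rate)


def is_uptick(baseline_rate: float, observed: int, threshold: float = 3.0) -> bool:
    """Flag the interval if the count sits more than `threshold` sigmas above baseline."""
    return uptick_zscore(baseline_rate, observed) > threshold


# The same 30% relative increase at two hypothetical service sizes.
small = uptick_zscore(10, 13)        # small service: 10 complaints per interval
large = uptick_zscore(1000, 1300)    # large service: 1000 complaints per interval

print(f"small service: z = {small:.2f}, flagged = {is_uptick(10, 13)}")
print(f"large service: z = {large:.2f}, flagged = {is_uptick(1000, 1300)}")
```

Under these assumptions, a 30% jump is within normal noise for the small service but stands out by more than nine standard deviations for the large one, which is why a higher baseline volume lets you use shorter intervals and still see the signal.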

moe · on April 22, 2011

Even with a large service, you can immediately identify a fluctuation in the number of complaints

"Immediately" is a relative term. I would say "1 hour" is pretty much as immediate as it gets at this scale.

I wouldn't be surprised if a significant complaint fluctuation only manifested long after Amazon discovered the problem in their own monitoring.

statistically identify an uptick in the number of complaints per smaller time interval

Yes, but volume is not everything. You also have to qualify (triage) the input, get engineers on the case, confirm the issue, and perhaps get clearance for a public announcement. All the while, many of the key people are busy trying to figure out what is going on, trying to dispatch information to the right people, or just running around waving their arms furiously.