I am not surprised by your calculation. Amazon has to make money from the service it provides; it does not operate on a no-loss, no-gain basis.
It is well known that EC2 is great if you have to get your system started quickly without worrying about the hardware infrastructure. It is also good for quick initial scaling. But it is good only up to a certain scale; beyond that it is not advisable. If it were cheap, why wouldn't other big companies use Amazon rather than managing their own data centers?
Understood that Amazon needs to make a profit, but these are still dollars that I have to pay, and that I could feasibly be using for something else (hiring great developers, perhaps).
I think the advantage of EC2 is exactly fast scaling, and it is an easy sell to management (no big upfront cost). But once we started using it at size and consistently, it was just cheaper to run our own hardware, as the blog attests.
We do all the racking and cabling using "remote hands" directed by our SAs in India, and so far this has not been the complex part for us. Getting Hadoop configured and our software running efficiently is several orders of magnitude harder, and I don't think EC2 helps here. If anything it hinders, as we are dealing with virtual hardware. There have been several posts about "lemon" EC2 instances and how you should test your instance before using it.
Using EC2 also takes away a part of the risk. If your startup fails in a year, then you don't get stuck in the end with a pile of hardware (for which you paid big bucks).
I guess that the best would be to use EC2 in the beginning, when you don't really know how much hardware you need, and later on use EC2 only for demand spikes.
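To make the "own the baseline, rent the spikes" idea above concrete, here is a rough sketch. The function name, the demand series, and the node counts are all made up for illustration; none of them come from the original post.

```python
def split_demand(demand_per_hour, owned_nodes):
    """Split hourly node demand into owned-hardware hours and burst hours.

    demand_per_hour: nodes needed in each hour (hypothetical numbers).
    owned_nodes: size of the cluster you own outright.
    """
    # Owned hardware serves demand up to its capacity in every hour.
    owned_hours = sum(min(d, owned_nodes) for d in demand_per_hour)
    # Anything above that capacity would be rented on demand (e.g. on EC2).
    burst_hours = sum(max(d - owned_nodes, 0) for d in demand_per_hour)
    return owned_hours, burst_hours


# Hypothetical example: a steady load of 10-12 nodes with one spike to 30.
owned, burst = split_demand([10, 12, 30, 11], owned_nodes=12)
print(owned, burst)  # 45 18
```

Only the burst hours would be paid at on-demand rates, so the size of the owned cluster is the knob to tune against how spiky your load really is.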
Yes, this is what we are doing. With this 4x cost differential, however, we calculated that it takes only 6-8 months for the hardware to pay for itself, so if you plan to be in business for that long you are better off buying.
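For anyone who wants to redo the payback estimate above with their own numbers, a minimal sketch follows. The dollar figures are placeholders chosen only so the arithmetic is easy to follow; they are not the actual costs behind the 6-8 month figure.

```python
def payback_months(hardware_capex, monthly_own_opex, monthly_ec2_cost):
    """Months until owned hardware pays for itself versus renting.

    hardware_capex: one-time purchase cost of the servers.
    monthly_own_opex: monthly cost of running them (colo, power, remote hands).
    monthly_ec2_cost: monthly bill for equivalent rented capacity.
    Returns None if owning never saves money on a monthly basis.
    """
    monthly_savings = monthly_ec2_cost - monthly_own_opex
    if monthly_savings <= 0:
        return None
    return hardware_capex / monthly_savings


# Placeholder figures (assumptions, not from the post):
# $70k of hardware, $5k/month to run it, $15k/month for the same capacity rented.
print(payback_months(70_000, 5_000, 15_000))  # 7.0
```

This ignores depreciation, staff time, and financing, so treat the result as a first-pass estimate rather than a full cost model.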
I would also suggest taking into account other cloud options. Amazon is very often not the cheapest option. Just compare it here: http://www.cloudorado.com/
I agree that virtualised server performance is lower than that of equivalent hardware, but four orders of magnitude seems very high and improbable.
Assuming you don't use S3 locally (I don't see EMC mentioned anywhere in your local cluster), we should remove that from the comparison.