Amazon EC2 with Hadoop is 4 times more expensive than running on our own cluster (deepvalue.net)
13 points by phaefele on Nov 23, 2012 | 10 comments


Are you implying Amazon Web Services is a high-margin business? That would be good for Amazon stock.

I agree virtualised server performance is lower than equivalent hardware, but a factor of 4 seems very high and improbable.

Assuming you don't use S3 locally, and since I don't see EMC mentioned anywhere in your local cluster, we should remove that from the comparison.


Some of the difference comes from the number of cores in the machines we purchased: they are 16-core machines (each E5-2650 has 8 cores).

In terms of storage, we are utilizing HDFS with commodity 3TB drives, thus no specialized storage from EMC (or the like.)


This is a high-margin business; look at Amazon's P/E ;-)


I am not surprised by your calculation. Amazon has to earn money from the service it provides; it is not run on a no-profit, no-loss basis. It is well known that EC2 is great if you have to get your system started quickly without worrying about the hardware infrastructure. It is also good for quick initial scaling. It scales well up to a point, but beyond that it is not advisable. If it were cheap, why wouldn't other big companies use Amazon rather than managing their own data centers?


Understood that Amazon needs to make a profit, but these are still dollars that I need to pay, and that I could feasibly be using for something else (hiring great developers, perhaps).

I think the advantage of EC2 is exactly fast scaling, and an easy sell to management (no big upfront cost). But as we started using it at size and consistently, it is just cheaper to run our own hardware, as the blog attests.

We do all the racking and cabling using "remote hands" directed from our SAs in India, and so far this has not been the complex part for us. Getting Hadoop configured and our software running efficiently is several orders of magnitude harder, and I don't think EC2 helps here. If anything it hinders, as we are dealing with virtual hardware. There have been several posts about "lemon" EC2 instances and how you should test your instance before using it.


Using EC2 also takes away a part of the risk. If your startup fails in a year, then you don't get stuck in the end with a pile of hardware (for which you paid big bucks).

I guess that the best would be to use EC2 in the beginning, when you don't really know how much hardware you need, and later on use EC2 only for demand spikes.


Yes - this is what we are doing. With this 4x cost differential, however, we calculated that it takes only 6-8 months for the hardware to pay for itself, so if you plan to be in business over that time frame you are better off buying.
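
A payback figure like this can be sanity-checked with a simple break-even calculation. The sketch below is not taken from the blog post: the function name `payback_months` and every dollar amount are hypothetical placeholders, and it ignores financing, depreciation, and resale value. Plug in your own numbers.

```python
def payback_months(hardware_cost, ec2_monthly, own_monthly):
    """Months until the upfront hardware purchase is recovered by monthly savings.

    hardware_cost: one-time purchase price of the cluster
    ec2_monthly:   what the equivalent capacity costs per month on EC2
    own_monthly:   recurring cost of running your own cluster
                   (colo space, power, remote hands, and so on)
    """
    monthly_savings = ec2_monthly - own_monthly
    if monthly_savings <= 0:
        raise ValueError("owning never pays off if it costs as much per month as renting")
    return hardware_cost / monthly_savings


# Hypothetical figures, chosen only for illustration (not from the article).
print(payback_months(hardware_cost=70_000, ec2_monthly=13_000, own_monthly=3_000))  # 7.0
```

The result is only as good as the monthly figures. Leaving staff time or colocation fees out of `own_monthly` makes the payback look shorter than it really is.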


I would also suggest taking into account other cloud options. Amazon is very often not the cheapest option. Just compare it here: http://www.cloudorado.com/


Nice analysis. EC2 certainly comes with its own problems: it is difficult to manage, and it is hard to integrate an EC2 Hadoop cluster with a private cluster.


We have run into issues with VPC being tied to a specific zone and the availability in that zone.



