The cloud is a vehicle for hourly billing and instant provisioning.
If you are not actively using either of those features, you should look into dedicated or colo. Now.
On every other front besides billing/provisioning, it will lose to dedicated: Speed, Price, Performance, Server Specs and control.
The cloud should be used to handle unexpected workload or random jobs only. If you are running your 50x database cluster on ec2 for 100% of the month, you are doing it wrong.
There's a good selection of dedicated hosting providers that provision physical servers in an hour or less. Softlayer has diagnosed and replaced failed hardware in my servers in under an hour a few times. If you take care of your own backups and configuration automation, then the advantage tips back to dedicated servers at small scale.
The metaphor that I used when describing this to my boss was that sometimes you need a traditional hotel, sometimes you need an extended-stay hotel, sometimes you want to rent an apartment, and sometimes you want to buy a house. It all depends on where you want to be and for how long.
Right now, we're in the extended-stay hotel phase. It doesn't mean that people who buy their own homes or stay in traditional hotels are doing it wrong.
You're ignoring the financial aspect of when that money needs to be paid. With AWS it's billed gradually over the lifetime of the servers (and if you have too many, you can easily reduce overhead with relatively little lost value.) Buy those servers, and you either have to pay up front or commit to a lease that may have breakage costs.
Additionally, if your service is growing at a material rate, there are inefficiencies around when you choose to turn on extra hardware. With colocation, you're probably going to do groups of machines at once (say once a quarter) and attempt to predict how many you'll need (naturally erring to the high side.) With cloud, you can provision new machines at any time _as needed_.
It's great to do a set piece calculation and say colocation is cheaper, but you're ignoring the realities of doing business - that plans change regularly. That flexibility is one of the primary benefits of using cloud services.
Ask most CEOs if they'd rather pay 300K now, or 400K over a year or two with a lot of optionality/flexibility, and I suspect they'll take the latter.
Cashflow is generally fungible - there are ways to make either strategy work such that you defer cash payments (this is why companies mostly actually lease colocated hardware.) But one leaves you more flexibility than the other, and that has real value in the real world.
Using reserved instances would push the EC2 figure down 30-40% (and like the dedicated option, provide further cost savings in years 2 and 3)
You can't assume that the marginal increased technical management cost is zero. If that's true, you're employing people that aren't doing anything productive with their time. A dedicated cluster of this size would likely consume 50-100% of one employee's time, which adds at least another $50k to that side of the ledger.
Yes, but if you have a cloud cluster of this size, it would still require similar time for systems administration, no? The only real difference here is hardware failures and replacements, but for a cluster of this size, I do not think that enough hardware failures will occur to occupy a technician that much compared to the cloud.
I think this is a false economy. You're just paying your provider to employ people that are capable of fixing these problems instead of in-housing them.
By and large, you manage physical, owned infrastructure the same way you manage cloud resources. That is: automated configuration, monitoring, and deployment. The vast majority of the time it's the same process. You just need to make sure that your instrumentation is sufficient to notice when things aren't running correctly and remove machines from service until someone can figure out what happened. This could be in the form of an outside tech or in-house support. Either way, at clusters of this size a loss of one, or even five, machines should not be the difference between everything working and a colossal systems failure.
I don't think it's safe to assume that paying for that service being done by someone maintaining 1000+ machines is necessarily going to be more expensive than paying to do it in-house for 10 or 100. It's generally going to be significantly more cost-efficient on a per-machine basis to have someone administrating 1000 machines vs. 100. If nothing else, the cost of specifying and purchasing hardware and getting staff up to speed on it is going to be spread much more efficiently, as is the cost of 24x7 oversight and backup hardware. Not to mention they're probably buying all that hardware significantly cheaper.
I think we are agreeing here, we're just phrasing it differently. I didn't mean to imply there was a direct cost comparison; just that part of your payment is for them to fund someone's paycheck. In the grand scheme of things it's definitely cheaper to maintain a high ratio of machines-under-maintenance to systems administrators.
That being said, the difference is Amazon pays an administrator to watch potentially thousands of machines of which you may be using a tiny fraction. You're essentially paying for a very tiny fraction of an admin's time (somewhere) who isn't vested in your business nor does he care that your site is acting wonky. He's just answering and working through tickets (if you bought support). Cloud providers can become markedly less helpful the instant you (attempt to) deviate from the very carefully maintained cookie-cutter pattern they've set out for you.
Working at a company with hundreds of machines and dedicated admins for them, I can tell you that unless the problem has immediate revenue impact, they're not going to handle it a whole lot differently from a cloud provider - you have to cut a ticket, it's prioritized, and it goes in the queue.
DanBlake, I share your sentiment. We went the real-servers route, and if you use programmatic methodologies for system administration (Puppet, Chef) you can have systems online just as quickly, or systems reprovisioned into different roles very quickly. We call it our own private cloud!
Edit: Note we are using an automated Debian preseed with Puppet installing from an internal repository to bring the systems online with minimal sysadmin interaction.
On the flip side the $100k cost for buying dedicated equipment is a first-year cost only, and the equipment can last up to 10 years, where at AWS $350k/year is every year.
I've generally found that hardware has about a 10-month payback against AWS, though I've tried to estimate my own (sysadmin) time cost to build out the datacenter, as well as just cash cost.
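A rough sketch of the payback arithmetic behind figures like these, in Python. The $100k purchase and the $350k/year AWS bill are the numbers quoted in the thread; the monthly cost of running owned gear is not stated anywhere, so `OWNED_MONTHLY` below is a purely hypothetical placeholder, and the function name is made up for illustration.

```python
def payback_months(capex, cloud_monthly, owned_monthly):
    """Months until buying hardware becomes cheaper than renting cloud capacity.

    capex          -- one-time hardware purchase
    cloud_monthly  -- recurring cloud bill per month
    owned_monthly  -- recurring cost of running owned gear (colo, power, staff)
    """
    monthly_saving = cloud_monthly - owned_monthly
    if monthly_saving <= 0:
        return None  # owning never pays back at these rates
    return capex / monthly_saving


CAPEX = 100_000                # from the thread: up-front hardware cost
CLOUD_MONTHLY = 350_000 / 12   # from the thread: $350k/year at AWS
OWNED_MONTHLY = 19_000         # ASSUMPTION: hypothetical colo + admin cost per month

months = payback_months(CAPEX, CLOUD_MONTHLY, OWNED_MONTHLY)
print(f"payback in about {months:.1f} months")
```

The result is very sensitive to the assumed running cost: as `owned_monthly` approaches the cloud bill, the payback period stretches out and eventually never arrives, which is why the function returns `None` in that case.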
I'm glad to see someone else is coming to similar conclusions.
Anecdotally, I've found that the big cost is up front, which is why it's daunting for many companies to make the change. What I find less comprehensible is the desire to "move to the cloud" from an existing full-stack infrastructure, as if replacing aging server hardware costs more than paying Amazon.
Your 10 month payback calc falls right in line with my figures of 9-12 months, depending largely on the number of man-hours required to get going.
If you've never vetted a DC or picked hardware before, twelve. If you know of three good DCs, have negotiated contracts before, know that you want a pile of HP DL360s, and know where to buy them, nine.
>I've generally found that hardware has about a 10-month payback aginst AWS
No wonder. AWS has its own business people who calculated how much they need to charge to turn a profit on a hardware investment that loses its value right down to the neighborhood of $0 in 2-3 years.
That's a big "might," and it adds significantly to the cost.
If I'm buying all brand-name Dell or HP, I expect it would be relatively easy to get a low interest rate without much hassle. However, if I'm also buying different brands of network hardware and, say, rolling my own high-performance storage[1], it's another matter.
Regardless, even leasing can cause "sticker shock," and, perhaps more importantly, still requires all the up-front time cost.
[1] Such as SuperMicro enclosures with commodity disks, rather than all brand-name Dell, which can easily double or triple the price, even with a steep discount.
That ballpark calculation only works on a particular subset of systems.
For example, on per-hour services it's nice to bring up big environments replicated for Dev and QA. Or to do incremental updates with full A/B live testing, with the old system eventually replaced.
Also, AWS has many services that take a lot of effort and planning to replicate if you build from the ground up, like load balancers and monitoring. In fact, it would be smart to play with an AWS system before buying a whole stack of BigIP/Cisco/EMC/IBM.
EBS is terrible, EC2 is dog slow and the relational MySQL thingie probably is unreliable. But everything else has shown itself to be very stable for a long time with very heavy users.
I'm upset at Amazon's terrible communication, but it's still the best option for starting anything bigger than a php webhosting plan.
As you note in the article $105/mb for bandwidth is extremely pricey.
With a 4 rack commit most colo providers would just throw in the 54mb of bandwidth for free. That being the case you would save a further ~$100k a year.
Well, I would say two things in response to this... that if anything the costs totaled up here are pretty inflated (read: vastly so).
First, you generally aren't going to be paying (or paying much) for reasonable connection speeds at colocation facilities. As someone noted already, many times it's included with a large enough contract. I know in my case we're essentially paying $2.5k/month for a full cabinet and a connection... essentially dirt-cheap and not in a rinky-dink colo facility either.
As for the hardware, you have to figure that spinning up 50 application servers on 50 sessions at Amazon is nowhere near the same as 50 sessions on your own hardware. If you virtualize, as the EC2 backend of course does, you're not sharing the hardware with anyone. You don't need to worry about noisy neighbors or I/O issues if you've purchased the right hardware. Essentially I'd go out on a limb and say you could at least halve those hardware numbers.
In my experience from physical to virtualized systems, even under high-load situations 90% of your 'load' issues are not going to come from processors, but from memory limitations, so yeah... hardware costs I expect will be lower than calculated... much lower.
From a "what you get" perspective I totally agree with you. I simplified the comparison to core count to have an easier basis to set up the rest of the analysis.
I would not be the least bit surprised if I halved those hardware numbers and it kept up just fine; mostly for the reasons you pointed out.
To me the IT people who bought into cloud hosting are the same type that would gladly buy a vacation time share: you end up paying too much to rent a slice of a resource that is never really yours, when you could've used that same money to outright own something much better.
When people tell me how great their vacation time share they "own" is I ask how often do they actually use it (3 weeks per year) and how much are they paying (a lot) then I point out that they could've booked first class air travel and a top floor penthouse at a 5 star hotel at a luxury destination for 3 weeks each year for far less money than their time share costs and fees. I'm glad someone is clearly pointing out that you can make the same case with cloud hosting vs. colo hosting.
Why is owning physical hardware so scary all of a sudden? Dear Lord, is it really that difficult to rack a physical box and replace the hard drive once in a while? AWS marketing deserves a Gold Medal for Industry-wide Brainwashing. The cloud is a time share!
I don't understand your analogy. One case involves making yearly payments regardless of use, while the other involves only paying when you use it. But from here it looks like cloud is to colo as hotel is to timeshare (or purchasing property).
As a small startup I'm happy to host in the cloud because it removes a lot of the up-front risk of buying physical hardware. Just as I get a hotel when I go on vacation because I don't want to purchase a beach house outright. But at scale, the economics change and it most likely goes the other way.
I had Dell's 24x7x365 4 hour, on-site service contract for a rack of 16 identical 1U servers. "4 hours" was for them to respond to my call (phone or email, I can't remember). Their response was not showing up on-site with the part, but asking me to run a long series of tests, including bringing the machine down, swapping out parts, resetting the BIOS, etc. Once I did all that, they ordered the part, then scheduled a delivery. It was 3-4 days, minimum, to get something fixed.
For the $1,909 price per server: I followed the link to Dell's site and tried to configure the R515 for myself. I added the second processor and left it set to the worse one, added the cheapest 16GB memory option, the redundant power supply and rails. That was $1,999.
A single, 250GB HD seems a bit "lite" for a server, even for one that does mostly processing. I didn't look into the networking or anything. I'm guessing at least another $1k to make it reasonable, probably more like $2k.
> A single, 250GB HD seems a bit "lite" for a server, even for one that does mostly processing.
I've found that 250GB is way overkill for a machine that does mostly processing. Consider web servers, which need a copy of the code (which hopefully isn't over a gig), and the operating system install for a server should fit in under 10GB. The bulk of the data they'll be processing will be in remote services like databases. If you have a decent logging infrastructure set up, where you regularly ship all logs off the machine, you need enough storage space for the logs for the rotation period -- if you ship them off immediately (say with scribe), you need enough local storage for the period your aggregator is unavailable (which can be minimized by having even scribe log aggregation load balanced). If these machines are in clusters and you can survive with some of them being out, you don't even need RAID on them. It's kind of unfortunate that the smallest drive you can buy leaves it 80%+ empty, because I have a feeling that drives that were smaller and not optimized for speed and capacity, but still use modern technology, might be more reliable.
We've taken to putting smallish SSDs in our load balancers and other machines that don't need a lot of local storage, since having fewer moving parts (now the only moving parts are the fans) is a power-usage and reliability win.
When you get to the point of ordering several servers, it makes sense to just order a full spare. Standardize on a particular setup and keep a certain ratio.
Then when one box goes down, we just swap HDs to another standby chassis. Back up in 10-15 mins, then you go dig into the problems with the other chassis.
How exactly do all these cost comparisons factor in the ability to quickly deploy hardware in 4 different regions in the world without having people at location to swap broken hardware or reboot an instance that is not available for whatever reason?
OK, we just had a big Amazon outage and someone compares no-redundancy costs vs Amazon. You need to double your dedicated hardware costs if you don't want to go down when the data centre goes down. Your Amazon costs include a whole lot more redundancy if you architect well.
What if EC2 is your failover solution? That seems like an ideal use case, considering the billing model. Keep one instance online at all times to clone the data, then spin up a few more to replace your servers when they die.
Or just build what you need in two datacenters with load balancing between them, and accept that operating at half capacity for a while is a lot better than being down entirely.
This is very theoretical and won't hold up in the real world. You also need to factor in the costs of hot standby equipment (can you provision a server with 68GB RAM within minutes?) and service (whom do you pay, and how much, for (a) servicing your hardware and (b) being on standby to fix issues within hours?).
You need to train your people that will maintain the physical servers. You need to factor in the probability of their mistakes (in my experience from a supercomputing center most problems were caused by people touching equipment).
And of course when you get mentioned on CNN you need to be able to handle the traffic peak.
Running a real world operation really isn't as simple as adding numbers from a colocation price table.
Nothing is black or white. From my past experience, when you start to have services that are I/O intensive, colocation is a good option, especially since you can easily control or tweak the underlying hardware. On the other hand, if you start with a small scale service and you don't have a large distributed datastore, "cloud hosting" is often simpler and more cost effective.
This is something I have been wondering about; it would be good to be able to have space in a cloud DC - where you can have a rack or ten of your own equipment as well. Thus being able to leverage both.
I don't know if it is offered at all by any of the cloud vendors, but it would be good to be able to install specialty machines/disk into the facility but still leverage all other aspects of the cloud provider.
An enjoyable post. I would add a couple of data points to the mix.
Getting a gigabit link with a competent IP-transit provider will be on the order of $3-5K/month. That's 1000 Mbit/s, 24/7, not limited by how many bytes you push through it.
A switch and router for your rack stacks will be on the order of $15K (that's a couple of 48 port GbE switches and a Cisco router (or equivalent))
You really need to understand the depreciation costs. As your equipment ages you will need to replace it (if only to keep on supported platforms). $100K + $30K for servers + $15K for networking gear is $145K of gear. If you squeeze all you can out of it and only replace it in 5 years then you can do a 5 year straight line depreciation so add about $30K/year to your costs for depreciating the old gear.
On the storage array, if you want 10TB of RAID-protected storage with the MD1220 you need 24 600GB SAS drives [1], which comes in at $23K each (not $12K). Oh, and you have two of those, so you again have about $10K/yr depreciation.
Oh and you probably want a service contract, something like onsite in 4 hrs or, if you're a bit more laid back, in 24 hrs. That will add another $150K/year (but I'm sure that you can get the sales guy to knock off a bunch as it's probably a list price vs 'what I can get it for' kind of deal).
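The depreciation figures above are plain straight-line arithmetic. A minimal Python sketch, using only the dollar amounts and the 5-year lifetime quoted in the comment (the function and variable names are made up for illustration):

```python
def straight_line_per_year(cost, years):
    """Annual straight-line depreciation: spread the cost evenly over the lifetime."""
    return cost / years


LIFETIME_YEARS = 5  # the replacement cycle assumed in the comment

# The three amounts quoted in the comment, totalling $145K of gear.
gear = 100_000 + 30_000 + 15_000
# Two storage arrays at the corrected $23K each.
arrays = 2 * 23_000

gear_per_year = straight_line_per_year(gear, LIFETIME_YEARS)
arrays_per_year = straight_line_per_year(arrays, LIFETIME_YEARS)

print(f"gear:   ${gear_per_year:,.0f}/year")
print(f"arrays: ${arrays_per_year:,.0f}/year")
```

These come out slightly under the rounded "about $30K" and "$10K" per year quoted in the comment, which is consistent with the rounding used there.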
Another real world bit that will bite you is that while you can "fit" all this gear in a 40U rack you can't put enough power into that rack at a colo facility to run it. The servers are 750W machines, so let's say you put 120V/30A circuits into your rack; you can really only draw about 25A before people complain, so you have about 3KW/circuit available. A 'normal' colo facility will offer you 2 per rack. So with 750W servers you can run 8 machines per rack. You'll probably not run them that hard and can get away with maybe 12 per rack. But with 54 total servers that is going to be 5 racks minimum and maybe 6 (remember your switch and router will take power too). Either way you're looking at 24 - 30 'circuits' for this space and those are probably about $500/month each so another $12-15K/month in 'power+cooling' charge.
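The power budget in this comment can be worked through step by step. The Python sketch below uses only the numbers given above (120V, about 25A usable, 2 circuits per rack, 750W servers, 54 servers, $500 per circuit per month). The circuit counts of 24 and 30 are taken as quoted rather than derived, and the helper names are hypothetical.

```python
import math

VOLTS = 120
USABLE_AMPS = 25        # what you can really draw from a 30A circuit
CIRCUITS_PER_RACK = 2
SERVER_WATTS = 750
SERVERS = 54
CIRCUIT_PRICE = 500     # dollars per circuit per month, as quoted


def racks_needed(servers, servers_per_rack):
    """Whole racks required; a partly filled rack still counts as a rack."""
    return math.ceil(servers / servers_per_rack)


def monthly_circuit_cost(circuits, price=CIRCUIT_PRICE):
    """Monthly 'power+cooling' charge for a given number of circuits."""
    return circuits * price


watts_per_circuit = VOLTS * USABLE_AMPS                   # about 3KW
rack_watts = watts_per_circuit * CIRCUITS_PER_RACK        # power budget per rack
servers_at_full_draw = rack_watts // SERVER_WATTS         # servers per rack at 750W each
racks_relaxed = racks_needed(SERVERS, 12)                 # the "maybe 12 per rack" case

print(f"{watts_per_circuit}W per circuit, {rack_watts}W per rack")
print(f"{servers_at_full_draw} servers per rack at full draw")
print(f"{racks_relaxed} racks at 12 servers per rack")
print(f"${monthly_circuit_cost(24):,} to ${monthly_circuit_cost(30):,} per month")
```

The sketch ignores the switch and router draw, which is why the comment allows for a sixth rack.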
You pretty much have to add in either the cost of a tech or half the cost of one of your operations employees to run this setup. Ideally you have two people at half time so that you can structure vacations for them. So put it down as one full time sysadmin and one full time tech, implemented as anywhere between 2 and 4 people. Don't forget to include the cost of their office space, their health plans, and their laptops :-).
Did you include travel time and travel expense? So most things can be 'lights out' but many exceptions to that rule exist. If you can drive to the data center from home then you're better off than if you have to fly there and check into a motel.
All that being said, it's an important exercise to run through and figure out the costs since it is your own money that you are spending. And AWS does get some economies from being able to fractionalize things like sysadmin resources.
> You really need to understand the depreciation costs
Huh? He's already paid for the gear. Depreciation isn't further payment due, it's an asset write-off which is actually welcome since it slowly turns the initial capital outlay into a tax deduction.
It sounds like you share a common mis-perception about what exactly depreciation means. Your statement "Depreciation isn't further payment due,..." is correct but this part "...it's an asset write-off" misses the point.
Assets have "lifetimes", which is to say that if you are using something in your business venture, it will eventually be "used up." In the case of computers they don't generally break but historically they get faster and can do more for the same amount of power approximately, historically doubling in performance every 18 months or so but now more like every 30 months or so. When an asset is "used up" you are going to have to replace it to keep your business running. So you're going to 'buy it again'.
Consider an alternate strategy: let's say there was a company that had a credit card that charged no interest and had a 60 month (5yr) payment plan (kind of like some car deals I guess). If the payment is 1/60th of the price of the gear each month, then that is clearly a bill that we pay every month. Now at the end of 5 years we "own" our equipment outright, except it's 5 years old.
Do you remember what was 'hot' in the PC business in 2006? Sure you do, go to blekko.com and type 'site:pcworld.com /date=2006' and one of the results is "Hot technology in 2006" [1]. Looks like 750G drives, and "the fastest desktop chips we've ever tested, the Intel Core 2 Duo." So if you had bought gear in 2006 you would probably have machines based on the Core 2 Duo architecture which is pig dog slow compared to a decent i5 or i7 motherboard today. Guess what? You need to buy new machines to stay competitive. (or as this is a web enterprise at least 'add' machines to stay competitive).
Depreciation is a way of capturing a "future" expense in today's revenues. If you are profitable, even after including your depreciation costs, then you should be 'banking' those costs so that in five years when your gear needs to be replaced or added to, you've got the capital you need to do that. If on the other hand you aren't profitable when you include depreciation (a state known as being 'under capitalized') when the time comes to replace your gear you won't have the money (capital) to do that, and you will either have to raise money, go into debt, or fade into obscurity with increasingly out of date gear.
The bottom line is that rather than being "an accounting trick," including depreciation in your cost structure helps you understand the total cost of owning, and operating, a bunch of gear which is powering your business.
This is not about depreciation cost alone.
Normally vendors tie the support cycle to the depreciation cycle, so you cannot get new parts or support unless you pay them heavily.
Sorry dude I don't know what you are talking about. The post talks about buying your own gear; the parent to my reply seems to misunderstand what depreciation is and goes on about some non existent recurring cost.
Are you talking about renting gear from vendors? This seems again to be a misunderstanding, no-one is renting anything. Support contracts are normally separate, certainly they are still available at normal cost within 5 years, the normal depreciation timescale.
It seems this is a really misunderstood topic. Maybe someone can do an "understanding depreciation for startup founders" post or something.
"Note: If you're going to disagree with one of my assumptions it's this last one. I am perfectly aware that a uniform duty cycle is unheard of when it comes to web applications..."
Well there you go. I know a startup paying a good $12k a month on the Amazon cloud, and their multiple for daily peak hour to valley hour is greater than 10. So given your assumption backed up by anecdotal evidence (of which I have my own), sure, colocating is cheaper.
That's why I made sure to document that. I have some clients that have an even higher disparity between their peaks and valleys that makes cloud hosting very viable. Auction type sites are a great example. As the time to an auction ending approaches zero everyone starts mashing refresh, but if nothing is going on the activity is almost zero.
I just wanted to clarify it isn't a one-size-fits-all.
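To make the peak-versus-valley point concrete, here is a small Python sketch comparing a fixed fleet sized for the busiest hour with hourly billing that follows demand. The 24-hour demand profile is entirely made up (it is not from the thread) and was chosen only to have a 10x peak-to-valley ratio; the function names are likewise hypothetical.

```python
def fixed_fleet_server_hours(hourly_demand):
    """An owned fleet must be sized for the busiest hour and runs around the clock."""
    return max(hourly_demand) * len(hourly_demand)


def elastic_server_hours(hourly_demand):
    """An hourly-billed fleet pays only for the servers needed in each hour."""
    return sum(hourly_demand)


# ASSUMPTION: hypothetical day with 20 quiet hours (2 servers) and 4 peak hours (20 servers).
demand = [2] * 20 + [20] * 4

fixed = fixed_fleet_server_hours(demand)
elastic = elastic_server_hours(demand)

utilization = elastic / fixed          # share of the fixed fleet actually used
break_even_premium = fixed / elastic   # how much more per server-hour cloud can cost

print(f"fixed fleet: {fixed} server-hours, elastic: {elastic} server-hours")
print(f"fixed fleet utilization: {utilization:.0%}")
print(f"cloud breaks even at up to {break_even_premium:.1f}x the per-hour price")
```

With a flat profile the two totals are equal and the premium collapses to 1x, which is the uniform duty cycle assumption being discussed.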
I don't get it, what about renting dedicated servers at a company like iweb or softlayer? much cheaper than colocation for startups, you don't have to worry about networking and hardware. and also much cheaper than cloud hosting.
(of course, it doesn't give you all the fancy features of cloud hosting such as flexible pricing and automated provisioning)
It was more of a comparison of the two extreme ends of the spectrum. Total hardware control vs. no hardware control. You're absolutely right that companies like iweb and softlayer offer great intermediate approaches. Personally, I have some great experience with softlayer but didn't want to make the article that long by including a third option/analysis :-).
Even still though, comparable hardware on softlayer's pricing looks like it's about $659/month. After 3 months it approaches a similar (but not identical) cost to the colo option.
If you need a lot of memory or disk space dedicated servers are not any cheaper, and often more expensive than cloud solutions. Hivelocity seems to be the only one with reasonably priced large memory dedicated servers, but they have a single facility in Tampa Florida as far as I'm aware which might be a problem for some businesses. If you need a lot of memory and disk space colocation usually has the best bang for the buck but you are also losing the ability to have new servers brought on quickly which can be a huge issue if you have a failure.
are dedicated servers that much cheaper than 'cloud'? last time I looked they were fairly comparable.
genuinely interested; I'm considering getting in to the 'instant provisioned dedicated server' market myself; there's no technical reason why dedicated servers need to take more than, oh, about sixty seconds longer than a virtual server to set up.
Dedicateds give huge bang for your buck - we have a pair at Hivelocity.net.
One's got dual xeons w/ 12gb of ram and 4x500gb in raid 10 for like $320 a month, which works out to 42 cents an hour.
The other's a single xeon w/ 8gb of ram and 2x500gb raid 1 for $160 which works out to 22 cents an hour.
Both come with 10tb of bandwidth a month and are exclusively used by us, nobody is messing with our disk io or anything else.
I don't know how they compare with AWS' compute units etc but if you have an ongoing need for the hardware and you're going to be paying those hourly fees (and all the others) every day of every month then I suspect it's going to be a lot cheaper than AWS.
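For comparing a flat monthly dedicated price with hourly cloud pricing, the conversion is a single division. The Python sketch below assumes an average month of 730 hours (365 × 24 / 12), which is my assumption and not something stated in the comment. The prices there are also approximate ("like $320"), so the results need not match the quoted cents exactly.

```python
HOURS_PER_MONTH = 365 * 24 / 12  # ASSUMPTION: average month of 730 hours


def monthly_to_hourly(monthly_price):
    """Hourly equivalent of a flat monthly price for a machine that is always on."""
    return monthly_price / HOURS_PER_MONTH


def hourly_to_monthly(hourly_price):
    """Monthly bill for an hourly-priced instance left running all month."""
    return hourly_price * HOURS_PER_MONTH


# Approximate monthly prices quoted in the comment.
servers = [("dual Xeon, 12GB, 4x500GB RAID 10", 320),
           ("single Xeon, 8GB, 2x500GB RAID 1", 160)]

for label, monthly in servers:
    hourly = monthly_to_hourly(monthly)
    print(f"{label}: ${monthly}/month is about ${hourly:.2f}/hour")
```

Running the conversion the other way with `hourly_to_monthly` gives the monthly bill for an hourly-priced instance that is never switched off, which is the like-for-like number for a workload that runs all month.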
During times when there's competition for spot instances that's going to be more money for less ram, less cpu, shitty disk IO (esp vs. our blazing raid10), and no included bandwidth. And at any time we might not be bidding enough to actually keep them...
"SPOT INSTANCE" is $0.2460 per hour right now. "ON DEMAND" is $0.50 per hour. Login to your AWS Management Console to see the current SPOT INSTANCE price.
Ah gotcha, thanks - assumed he meant that the price was listed on the page.
The point about 8 vs. 2 cores still stands, though? Add to that that spot pricing isn't really comparable to guaranteed pricing (on demand is more comparable since dedicated servers are on-demand, you can cancel and get your month prorated).
It's a classic lease vs. buy argument. If you only need a resource for a day, then relatively speaking, it's cheaper. However, if you pay the rental fee every day for a year, it gets more and more expensive than an outright purchase. The comparison largely depends on the time scale you look at.
The main reason I'm aware of that provisioning dedicated servers takes longer is that you have to deal with data center logistics. Which rack does it go in? Which VLANs does it need to be in? What bandwidth is the customer paying for and how will we route the uplink? If the customer has other machines can we put these new ones in the same rack without overloading the PDU?
There's a lot of hidden work that goes into data center management.
Sorry, I guess i wasn't clear. I meant renting a dedicated server, which I thought was comparable in cost to 'the cloud' - buying, of course, is more expensive in the short term and cheaper in the long term.
Personally, I think a /whole lot/ of the pain of owning (small amounts) of hardware is that most of the co-location vendors spend inordinate amounts of time trying to 'extract maximum value' (meaning, they fuck around on getting you a price, trying to get the max value you will accept.) I'm considering getting back in to the co-location business[1] in part because of this, and in part because it would let me scale to the point that it would help amortize out all the time I have to spend negotiating.
Man, I just spent three hours today talking to the owner of my data center, and he /still/ didn't give me a price. "tell me what you get from the other guy" - I mean, he's a nice guy, and I didn't really mind hanging out with him, I probably even got a little bit of useful information, but three hours of two guys, both of whom can probably bill out at $100+/hr, and we still didn't arrive at a price for an operation as small as mine? (approx. 150a of 120v power and accompanying rackspace; 5 cabinets or so. This didn't even include bandwidth.)
I spent some time trying to put together some numbers. From the research it looks like cost pressure against traditional colocation options has forced the pricing to come down. It still looks like owning your own hardware is a viable, and if you have people, good option.
as demonstrated by the frequent data center outages, it's extremely difficult to have 100% uptime, unless you have a crack team of sysadmins and operators who are better than Google or Amazon engineers at server and data center management. Investing in hardware means investing precious time and engineering talent that you could otherwise invest in sales, marketing, product design or any other critical aspect of your business.
Of course, if you find that hosting takes a large % of your costs, and if you are sure you can do it yourself at lower cost, then it's time to have your own hardware.
Amazon provides power, servers, bandwidth and their network.
You still need administrators and ops people. If you colo, your data-center will provide power and bandwidth. You just need your cabinet network and the servers.
On top of that, Amazon costs more, and exposes you to problems like the EBS outage that you simply don't have with colocation because you haven't gone and developed a (probably necessarily) complex provisioning system to manage it all.
The implication in your statements is clear that the act of purchasing, provisioning and maintaining hardware is a primary driver of your IT/Operations work-load, when in reality that's a minority concern at best for most. It borders on misinformation.
Network or electricity or software will go down one day, and then you will begin to think about disaster recovery plans and get another colocated space in the eastern or western US.
A simpler option is to rent dedicated servers from 2 different hosting companies, say Softlayer (US) and OVH (France), and design a fall-back mechanism. It will cost you less than Amazon or owning your own hardware.
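One way such a fall-back mechanism could look, as a minimal sketch only: try the providers in priority order and use the first one that answers a health check. The hostnames, the `pick_endpoint` helper and the plain TCP probe are all assumptions for illustration; a real setup would more likely do this at the DNS or load balancer level.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_probe(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint succeeds within the timeout."""
    host, port = endpoint
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def pick_endpoint(
    endpoints: Sequence[Endpoint],
    is_healthy: Callable[[Endpoint], bool] = tcp_probe,
) -> Optional[Endpoint]:
    """Return the first healthy endpoint in priority order, or None if all are down."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical hosts at two unrelated providers, listed in priority order.
PROVIDERS = [("app.provider-a.example", 443), ("app.provider-b.example", 443)]

# Simulate an outage at the first provider with a fake probe (no network needed).
down = {PROVIDERS[0]}
active = pick_endpoint(PROVIDERS, is_healthy=lambda ep: ep not in down)
print(active)
```

The probe is passed in as a parameter so the selection logic can be exercised without touching the network, which is also how you would test the fall-back path before you need it.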
I agree with the calculation in the article for applications that have "shared state". However, if you are doing something that is a bit more amenable to the cloud, like serving assets or processing email then the calculation is very different.
The real killer in the calculation is in the assumption of a uniform distribution cycle. Any service that can break away from that looks very different. If you serve static video files, for example, you cannot hope to match the geo-distributed service AWS Cloudfront offers unless you are huge.
A smart mix of colo and cloud is the way to go for medium sized businesses.
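To make the non-uniform load point concrete, here is a rough sketch of how the calculation changes when demand is spiky. Every number in it is made up for illustration (the demand profile, the $200/month dedicated price and the $0.50/hour cloud price are assumptions, not quotes from any provider).

```python
from typing import Sequence


def hybrid_monthly_cost(
    hourly_demand: Sequence[int],
    dedicated_servers: int,
    dedicated_monthly_price: float,
    cloud_hourly_price: float,
) -> float:
    """Cost of a fixed dedicated baseline plus cloud servers for any demand above it."""
    burst_server_hours = sum(max(0, need - dedicated_servers) for need in hourly_demand)
    return dedicated_servers * dedicated_monthly_price + burst_server_hours * cloud_hourly_price


def best_baseline(
    hourly_demand: Sequence[int],
    dedicated_monthly_price: float,
    cloud_hourly_price: float,
) -> int:
    """Try every baseline size from 0 to peak demand and return the cheapest."""
    candidates = range(max(hourly_demand) + 1)
    return min(
        candidates,
        key=lambda n: hybrid_monthly_cost(
            hourly_demand, n, dedicated_monthly_price, cloud_hourly_price
        ),
    )


# Hypothetical 720-hour month: 4 servers needed most of the time, 20 during one busy day.
demand = [4] * 696 + [20] * 24
DEDICATED_PRICE = 200.0  # assumed price per dedicated server per month
CLOUD_PRICE = 0.50       # assumed price per cloud server per hour

for n in (0, 4, 20):
    print(n, hybrid_monthly_cost(demand, n, DEDICATED_PRICE, CLOUD_PRICE))
print("cheapest baseline:", best_baseline(demand, DEDICATED_PRICE, CLOUD_PRICE))
```

With these made-up prices, all-cloud costs 1632, sizing dedicated for the peak costs 4000, and a baseline of 4 dedicated servers with cloud burst costs 992. With a flat demand profile the burst term disappears and the comparison collapses to the plain per-server price.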
Just wondering, how many users/usage can a set up like the one in the article handle? (For example, how many concurrent users for a web app that's a social networking site with mostly news feeds, comments and some pics?)
To be honest, it's impossible to tell without actually measuring it. The #1 rule of scaling a system is to measure the variables you're trying to optimize, make a change, and then take measurements again. So without knowing the kind of software that's running on this, we have to leave everything in terms of a theoretical workload.
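The measure, change, measure again loop can be sketched in a few lines. This is only an illustration: the two workload functions are hypothetical stand-ins for "before the change" and "after the change", and wall-clock time is just one of the variables you might care about.

```python
import statistics
import time
from typing import Callable


def measure(workload: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock time in seconds over `runs` calls to `workload`."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(before: Callable[[], object], after: Callable[[], object], runs: int = 5) -> dict:
    """Measure the old version, then the new one, and report both plus the difference."""
    before_s = measure(before, runs)
    after_s = measure(after, runs)
    return {"before_s": before_s, "after_s": after_s, "change_s": after_s - before_s}


# Hypothetical stand-ins for the system before and after a change.
def build_with_loop():
    out = []
    for i in range(100_000):
        out.append(i * 2)
    return out


def build_with_comprehension():
    return [i * 2 for i in range(100_000)]


result = compare(build_with_loop, build_with_comprehension)
print(result)
```

The median is used rather than the mean so that a single slow run (a cold cache, a noisy neighbor) does not dominate the result.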
If you aren't doing graph operations, then 50 dedicated servers can do a pretty impressive workload. Add in graph-ops and all bets are off :-). The same could apply for user-generated content (eg youtube, facebook, vimeo).
Stackoverflow is another great example of a company that vertically owns their stack. Check out:
I did the cost calculations for a very very small business, and rackspace's cloud hosting was more cost effective option for us. I'm sure at some point that will no longer be the case, but we'll calculate that when we get to it. Even then, I'm always willing to pay a premium to let someone else manage hardware headaches.
Uh, no hardware failure costs? Both the time and money to handle such. Hardware fails, which costs time and money to fix. To not account for failure within a year is either delusional or using some extremely high-quality hardware which is going to cost a lot more than the linked to Dell server.
There's uhh.. a thing called a warranty which generally is sold with every-single-device-sold-ever that is enterprise grade. This is also where they make their money. If a hard-drive dies you call Dell and they ship you a new one. If a hard drive in a $40k+ Netapp dies, you call Netapp and they tell you their rep is waiting for you at the data-center and to hurry up. If your Cisco switch dies you call Cisco and they ship you a.... you get the idea.
As someone said above, if your Dell hard drive fails, they respond to you in 4 hours, then they ask you to run a series of tests, blah blah blah, then you might get a replacement that day or in a few days.
This is likely to cost at least 4 hours of a sysadmin's time, maybe closer to 10?
So meanwhile your sysadmin has gone over to the co-lo to swap parts, and as mentioned elsewhere dealt with CS, shipping, and such. Real-world problems. The cost is not $0.
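Putting rough numbers on that: the sketch below takes the 4 to 10 hours of sysadmin time mentioned above and, as an assumption, the $100/hr billing rate quoted earlier in the thread. The function name is made up, and the part itself is assumed to be free under warranty.

```python
def incident_cost(admin_hours: float, hourly_rate: float, parts_cost: float = 0.0) -> float:
    """Labor plus parts for one hardware failure."""
    return admin_hours * hourly_rate + parts_cost


HOURLY_RATE = 100.0  # assumption: the $100/hr figure mentioned earlier in the thread

low = incident_cost(admin_hours=4, hourly_rate=HOURLY_RATE)    # warranty covers the part
high = incident_cost(admin_hours=10, hourly_rate=HOURLY_RATE)  # warranty covers the part
print(low, high)
```

So even with the replacement part covered, each failed drive lands somewhere between $400 and $1,000 of labor under these assumptions.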
One problem we recently faced: we run Xen on remote servers.
We found out that our 3-year-old storage hardware had some firmware conflict with Xen (or something like that).
It took 2 weeks to diagnose plus a plane trip out to fix, and even then we weren't sure we got it.
After this, we moved to Rackspace. There is no way that a sysadmin running one rack of equipment is in the same position as Rackspace to diagnose and fix these types of issues.
I imagine if Rackspace/Amazon had this type of problem, they would:
1. Have 24 hour manned data centers
2. Have the input of a team of 20 engineers working on the problem
3. Have a Dell/IBM/HP engineer at the data center within the hour
4. Have lots of spares - no calling in new hardware
Having been a (part time) sys admin for years, the worst problems are those that are hardware related and are difficult to diagnose.
(The other advantages of cloud, e.g. scalability, are well documented. But I think my ideal setup would be dedicated, non-virtualized databases with cloud front ends.)
I've been on the other end of this spectrum too. Working with a very high profile company (that I unfortunately can't name) we were paying what I'm going to call an astronomical rate for managed cloud operations. Dedicated data center admins, systems administrators, the works.
We would have a problem with something like IO throughput on database instances. We would open a ticket, wait a little while, get a response. If we claimed it was hardware related (because we couldn't tell from our host's perspective) we got the response "it's not the hardware, everything seems fine." This would go on for days. Then eventually, after we had to prove the numbers were erratic or unresponsive we would eventually get a more helpful response. Maybe.
It's a very cold splash of water in the face when you realize that your hosting company, cloud provider, whatever is not in business to hold your hand. If you need more than their minimum level of support or require human interaction you will be sadly disappointed. These companies maintain their margins by automating hardware provisioning, homogenizing infrastructure, and making it as turn-key as possible. Which is all fine and good until you need something their infrastructure doesn't provide for. Like switch bandwidth. Like larger instances. Like all your VMs in a local rack.
You will have hardware problems in the cloud too and they will not be obvious. You will need the same degree of monitoring software you have anywhere else in any other environment.
You will have hardware problems in the cloud too and they will not be obvious
I'm often hearing "in the cloud, one doesn't have to worry about that hardware" (or network). My usual retort is that one certainly does have to worry, since the same problems exist, just that one can't do anything about them when (or before) they occur, unlike with owned hardware.
Don't forget taxes. In most locales you will pay tax for owning those servers. Depreciating them as fast as possible will help but owning stuff as a business is hard expensive work from a tax perspective.
This is great information and in light of the AWS outage of last week - I would recommend a hybrid model. Just as you recommend AVPC be used for dev/testing - you could potentially achieve an ideal hybrid by deploying some smaller % of the dedicated hardware and having AWS fulfill elastic capacity needs.
Obviously this is determined by your application specifics, but there are a lot of deployment methods to consider.
The biggest issue I have with your assessment, however, is that the networking gear cost at 10% is far too low.
You can get fair LB capabilities from low cost vendors like Coyote, but your switch infrastructure will likely be much higher than 10K unless you're doing some bare bones setup with 1U stacks and no redundancy.
Further, I would expand this model and add stand-by and failover hardware in the calc.
In which case I would round up to 60 servers and have a tertiary DB box as well.
Finally, I would add a support/contingency budget of 15% for emergency gear replacements.
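Here is roughly what those two adjustments do to the hardware line. The per-server price is a made-up placeholder, and I'm assuming the 15% contingency is applied to the hardware total and that the 60 servers already include the standby, failover and tertiary DB boxes.

```python
def hardware_budget(servers: int, price_per_server: float, contingency_pct: float = 15.0) -> dict:
    """Hardware spend plus a contingency reserve expressed as a percentage of hardware."""
    hardware = servers * price_per_server
    contingency = hardware * contingency_pct / 100
    return {"hardware": hardware, "contingency": contingency, "total": hardware + contingency}


PRICE_PER_SERVER = 3000.0  # hypothetical unit price, not taken from the article

original = hardware_budget(50, PRICE_PER_SERVER, contingency_pct=0)  # 50 servers, no reserve
padded = hardware_budget(60, PRICE_PER_SERVER)                       # 60 servers + 15% reserve

print(original["total"], padded["total"])
print("increase factor:", padded["total"] / original["total"])
```

Whatever the unit price, going from 50 to 60 servers and adding 15% on top raises the hardware line by about 38% (1.2 x 1.15 = 1.38).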
In the case of staff, there is a strong likelihood that you would need more staff for dedicated setups than you would for hosted setups:
You need your staff to have more specialized skills in DB, routing, sysadmin work, etc. You also need to consider that you'll have more support costs for round-the-clock and on-call support. While you have these costs with AWS, they are lessened because your staff are fundamentally in a reactive state only: there is no proactive PM on hardware etc. with AWS. You simply respond when outages occur and wait to regain access to your affected systems.
Staff on call would be required to be able to delve much deeper to root cause any outages you have in a dedicated environment, travel to the site and physically mitigate any gear failures.
Overall though, this is a fantastic perspective; everyone should put it in Excel and type in their own numbers.
OK, let me rephrase... for the vast majority of businesses using the cloud, and vast majority means small businesses since there are way more small businesses than big businesses, the cloud is much, much cheaper.
The 'cloud' has enabled my business to grow. Without it (despite the fact that our business is delivering customers to the cloud), I would not be able to grow and scale my business efficiently.
The author is talking about hosting an application in the cloud, which does not nearly encompass all the use cases for the cloud, and thus invalidates the argument.
If you are not actively using either of those features, you should look into dedicated or colo. Now.
On every other front besides billing/provisioning, it will lose to dedicated: Speed, Price, Performance, Server Specs and control.
The cloud should be used to handle unexpected workload or random jobs only. If you are running your 50x database cluster on ec2 for 100% of the month, you are doing it wrong.