If you remove any rational analysis based on productivity, developer time, or fi...

KyleBrandt · on Nov 17, 2011

Okay, I'll bite...

> I spent a lot of time in the hot/cold, blaring fan world of data centers -- sometimes late at night, when a hard drive failed or a switch's fans went out.

If a hard drive or a fan failing requires someone's attention at the data center in the middle of the night something is wrong (unless your company has such high uptime requirements that even redundant systems need to be fixed right away). Either your company can't afford to provide proper redundancy, or someone on the technical side has failed to implement redundancy or failover properly (which happens, but should be corrected). For example we had an entire router fail and this just caused a blip. We didn't have to do anything right away to fix it, just figure out what went wrong and bring it back up the next day. Also we don't really end up spending very much time in the data center itself (in fact I work remotely full time).

> We like dorking around with hardware and data centers, just like someone else might link tinkering with their car or collecting bottle caps

We do like to consider ourselves experts -- we don't just dork around and tinker (at least, not most of the time)

> If you remove any rational analysis based on productivity, developer time, or finances, then we're only left with a question of personal taste

In part this is a matter of taste. But as I said:

"This culture means when we hire technical staff, we hire people who share this passion. I believe that this passion translates into a better product. Whenever someone does a cost analysis of cloud vs self hosting there is no row in the spreadsheet for “Work Productivity Increase due to Passion.”

Just because there is no good method to put it on a spreadsheet, doesn't mean it doesn't have value for the company. There are a lot of technical reasons we like to have control over the full stack. But I think it largely comes down to our culture. The culture of a company is very important. A culture could be all about how the numbers fall after you calculate your capital and operating expenses etc. Somehow it seems to me though that many people don't seem very happy in those cultures -- and this will effect employee retention and the sort of talent you can attract.

mwsherman · on Nov 18, 2011

For me, the best justification for “doing it ourselves” is that it’s what exceptional companies do. We dig under abstraction layers not knowing what we’ll find. We dig under Linq to SQL. We dig under Redis clients. We really never consider the vendor’s platform good enough.

The result of a million small optimizations is a platform that others can’t create.

kamaal · on Nov 18, 2011

Thanks for the honest reply.

But come to think of it in an another way. If you actually let some one else focus on the issues you don't necessarily need to handle you can focus well on issue you actually need to work on.

But, yes its all about personal preferences and I respect your preferences. In my case I would actually let some one else do the job I don't necessarily have to do and use the same time to do well in my actual job.

ssmoot · on Nov 18, 2011

Instead of 12 physical cores, 96GB of RAM and a 2TB SSD Array pushing 1M IOPs on dedicated hardware for my PostgreSQL database servers, I'd need 1TB of RAM in an AWS box because I'll be lucky if I can even break 10K in IOPs.

Does the price make sense then?

I have yet to see any significant AWS deployment that doesn't feel like it could be done better, more reliably, and much more cheaply as a co-located setup.

reuser · on Nov 18, 2011

You can't really create a massive co-located setup on demand for big jobs, then tear it down ... use the right tools for whatever job you're doing. EC2 being more expensive for your Postgres deploy doesn't mean it is an expensive toy

ssmoot · on Nov 18, 2011

I don't think I could say it better: http://twitter.com/#!/dhh/status/133656040561577984

If you really have a need to setup and teardown a bunch of cores for the occassional large batch, nobody's questioning that the cloud is an economical way to do that.

But I've never had to do that. Not in a way that would make the development time involved economical anyway. Wait two hours for this once-in-a-blue-moon processing job to finish, or spend a day setting up a process to handle such jobs quickly in the future?

It'd really take something exceptional (again, with low IOPs demands) to have that make much sense unless I was already hosting in the cloud and had invested money in making such a task quick and cheap.

nupark2 · on Nov 18, 2011

Instead of 12 physical cores, 96GB of RAM and a 2TB SSD Array pushing 1M IOPs on dedicated hardware for my PostgreSQL database servers, I'd need 1TB of RAM in an AWS box because I'll be lucky if I can even break 10K in IOPs.

Does the price make sense then?

Does it? That's the point of analysis. It may be that you can scale up your organization on AWS, and then make the decision to scale something like a database vertically once you actually need it.

The choice should be based on a rational cost analysis, however. Not just money, but developer time, productivity, and a potential loss of focus on the core competency -- which should be your product, not your commodity infrastructure.

I have yet to see any significant AWS deployment that doesn't feel like it could be done better, more reliably, and much more cheaply as a co-located setup.

That's a stretch. Especially the "cheaply" part. The operational costs involved in building and maintaining a significant co-located deployment are huge, not to mention the capital expenditure involved in enterprise networking hardware, servers, cages, racks, PDUs, etc.

I don't firmly fall on either side of the debate -- one must balance the requirements and costs, like anything else.

However, I do firmly believe that we should have programmers automating the entire software system administration job away, leaving only the question of hardware provisioning. That's why we have "devops" style teams nowdays, and I only expect that trend to grow.

ssmoot · on Nov 18, 2011

> Does it?

No, it doesn't. You're throwing up a spectre. This are pretty easy numbers to come by. The point in this one example being vertical scaling options are pretty constrained on something designed to really only effectively scale cores and RAM.

> That's a stretch. Especially the "cheaply" part.

All this is a straw man. I really have a hard time believing "I don't firmly fall on either side of the debate". In fact, I call BS.

Who actually has to wire their own PDUs unless they want to? Or is forced to buy cages, racks, etc?

If you want to I suppose you can, but I haven't hit a Tier 1 data-center where that's even an option. Unless you were to buy an unfurnished cage perhaps on the terms you got to hire your own people for the build out.

Otherwise, for most deployments, even for something like Reddit, you're talking about stacking a couple 48 port switches, racking a few servers, and making sure you don't screw up airflow with bad cabling. Just spend $500 on an experienced cable guy to wire it up. Cabling isn't the most fun.

Staffing costs are a drop in the bucket. Racking a couple dozen systems, configuring your ports is pretty trivial.

If you have a business with dependable positive cash flow affording you the luxury of signing a three year hardware lease, it doesn't make cash sense to do anything else for 99% of deployments.

The "core competency" stuff is just a salve for developers who want to live in a homogenous environment. Life gets easier with a competent IT person. Not harder. You still have to monitor processes, setup syslog servers, archive logs, monitor disk space, load, available RAM, bandwidth, trace down abusers, setup mail servers for at least internal monitoring, create backup procedures, automate deployments, figure out how to compile some old library.

All of this stuff is where your IT staff effort goes. Not in the once-in-a-blue-moon-we-have-to-rack-some-servers tasks. You aren't better off because Amazon is handling your power requirements instead of a good colo. Those things are details you never have to think about either way. Saying so is a definite straw-man.

asharp · on Nov 18, 2011

Saying that you can't get IOPS on AWS is like saying that your machinegun sucks at grating cheese. It's really not meant to do that.

But that's fine as AWS isn't the only cloud platform out there, and some are actually desgined to take data intensive workloads.

ericd · on Nov 18, 2011

The problem is that it's touted as a great way to scale web apps. Since DB performance is usually the limiting factor in webapp scaling, this doesn't appear to stack up. Needing a lot of IOPS is pretty par for the course.

asharp · on Nov 19, 2011

For certain workloads it deals ok, but yeah, I'd fully agree with you.

On the other hand, there are plenty of clouds out there that have reasonable IOP performance, so it's not as though it's a monopoly.

ericd · on Nov 20, 2011

I guess my question is why go for reasonable when you can get incredible performance per dollar with RAID-10 SSDs? A few dedicated DB machines with SSDs and many cores can get you absolutely monstrous throughput without going for any of the more exotic DB solutions.

ssmoot · on Nov 18, 2011

Exactly. And it's not like it's a good idea to put your DB in an IOPs enhanced cloud, and your applications in another.