Autoscaling on AWS (qzaidi.github.io)
68 points by mqzaidi on June 21, 2013 | 28 comments



I have found that using an Ubuntu base AMI and Salt (http://saltstack.com/community.html) makes things much easier to manage. His suggestion of baking node.js and nginx into the AMI becomes a big pain when you need to upgrade node.js or nginx (hello, security update!). Using something like Salt (or Puppet or Chef) would make this much easier.
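
As a sketch of the upside (assuming the fleet is already enrolled with a Salt master), a security update becomes a one-liner instead of an AMI rebake:

    # hypothetical: upgrade packages on every minion at once
    salt '*' pkg.upgrade
    # or converge everything back to the declared states
    salt '*' state.highstate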

But in general, I agree with the article.


Yes, it is a pain to bake that much into your AMI, but the upside is that you autoscale faster. When you can scale faster, you can set your thresholds higher, because you know a new instance will be ready to serve traffic through the ELB sooner.

Granted, node and nginx don't take long to update. But the more updates you keep piling on, the longer your true "time to ready" becomes. And the stretch from EC2 launch to ELB registration can already take a while.

Why not have the best of both and use puppet/chef to rebuild your AMI any time the baked-in packages require a security update?
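
Roughly (a hedged sketch with made-up names and IDs, using today's aws CLI): converge a builder instance, bake it, and point autoscaling at the result:

    # converge a long-lived builder instance, then snapshot it into a fresh AMI
    ssh builder 'sudo chef-client'
    aws ec2 create-image --instance-id i-0123456789abcdef0 --name "web-$(date +%Y%m%d)"

    # register a launch configuration for the new AMI and swap the group over
    aws autoscaling create-launch-configuration \
        --launch-configuration-name web-20130621 \
        --image-id ami-12345678 --instance-type m1.large
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name web --launch-configuration-name web-20130621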


We enable auto-applied security updates in the AMI, and the first thing the provisioning script does is update packages. That said, I would definitely want to explore alternatives - not needing to update while provisioning would shave a few seconds off the provisioning time.
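
For reference, a minimal sketch of that first provisioning step on Ubuntu (the part it would be nice to make unnecessary):

    #!/bin/bash
    # hypothetical opening of the provisioning script
    apt-get update -q
    apt-get upgrade -y -q    # the few seconds we'd like to shave off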


What if you rebuilt the AMI every time you deployed?



Agreed. We've had success using a bare-bones Ubuntu AMI and then having the user data install Chef and do a Chef run.

This results in slower startups for new machines, but the flexibility seems to be worth it. Eventually we'll get to a point where the system will just build a new AMI any time a cookbook change is committed.
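
Roughly, the user data looks like this (a hedged sketch; the install URL and first-boot JSON are illustrative):

    #!/bin/bash
    # bootstrap Chef on a bare Ubuntu AMI, then converge the node
    curl -L https://www.opscode.com/chef/install.sh | bash
    chef-client -j /etc/chef/first-boot.json    # run-list shipped in the first-boot JSON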


I'm probably not understanding something, but couldn't you use Puppet or Chef for your manual provisioning tasks?


What CSS/framework/whatever are you using for that metro-style dashboard?


That's Dashing from Shopify. It's really awesome; it took me only an hour or so to set up with the integrations. Here's the original: https://github.com/Shopify/dashing

I am using the Node.js port though: npm install dashing-js


Looks like Dashing from Shopify:

http://shopify.github.io/dashing/


Dashing is a great thing to play around with in the evening. I've got a bunch of custom data sources I should really open source for other people, like Heroku dyno counts etc.


Not to detract from this article, which is explicitly on the subject of AWS, but if you are relying on any single cloud provider then you are probably not going to scale beyond a certain point, since it means that reliability (i.e. heterogeneity across cloud providers, physical/logical points of presence, legal jurisdictions, etc.) is not all that important to you. At a certain point, a cloud-provider-neutral approach is called for.


It's not that simple. Having only one cloud provider allows you to make the most of it.

Once you have two providers, you are forced to use only the features offered by both of them, behind an abstraction layer (API-agnostic development).

Plus, AWS offers a wide range of strategies to ensure the availability of its infrastructure (Availability Zones, Regions, CDN, etc.).


> Once you have two providers, you are forced to use only the features offered by both of them, behind an abstraction layer (API-agnostic development).

Abstraction layers don't necessarily need to produce lowest common denominator results.

> Plus, AWS offers a wide range of strategies to ensure the availability of its infrastructure (Availability Zones, Regions, CDN, etc.).

You still wind up vulnerable to quirks of the single provider, though! For example, an account freeze for whatever reason (financial quirks, legal issues, regulatory change), or multi-site failures (e.g. financial, operational, legal, security) at that provider.


> Abstraction layers don't necessarily need to produce lowest common denominator results.

Except in the special case where you can build the functionality a provider doesn't offer out of functionality it does, I think they do, since otherwise the abstraction leaks. A leaky abstraction is often worse than none.


I think your assumption is that cloud providers with different features cannot be abstracted without making the whole abstraction 'leaky'.

While your perspective may hold for a traditional, rigid, single-layer abstraction with the most simplistic, binary notion of feature presence, it does not hold for better-formed solutions. From https://en.wikipedia.org/wiki/Abstraction_layer: "All problems in computer science can be solved by another level of indirection" (David Wheeler).

Some real world differences between cloud providers: available hardware, available installation images (OS images), available bandwidth, available logical location on the internet, available physical location (legal jurisdiction, etc.), scale-out time, cost model.

How do you conceive of these differences, and then deploy arbitrary services to an arbitrary number of individual instances running unique combinations of cloud provider specific features on unique cloud-provider deployed OS images in parallel? There are various approaches, but it's not that hard to come up with a functional set of abstractions. Think about it.


I'd like to hear more about this. When buying or renting physical machines is worthwhile, how would I actually go about it? Should I get them preassembled, should I buy parts and glue them together, or do people just generally rent their infrastructure?


If your workload is totally standard then renting dedicated servers can make sense; it's certainly flexible if you expect to scale on a monthly basis. However, if you need something "premium" like an SSD or lots of RAM, you can end up paying $300/month. Buying a used server off eBay for $400, putting a new HDD in it, and getting colocation for $100/month will be the most cost-efficient for those on a tight budget. The provider will usually do basic maintenance like replacing HDDs for a small fee.
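
To put rough numbers on it: versus the $300/month rental, the $400 used server plus $100/month colo pays for itself in $400 / ($300 - $100) = 2 months.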

That said, if money is not a problem then go with whatever lets you get stuff done with the least pain.


At a certain point, real hardware is called for if you wish to scale. My experience: 1/3 the price and 2x the performance on SoftLayer vs. a $97k/mo EC2 spend. Obviously specific to my workload, but you should at least benchmark on real hardware.


Agreed.

However, cloud does have benefits that you lose with physical, e.g. speed of initial deployment, speed of scale-out (no ordering/assembly/wiring/burn-in-test lead times), fewer maintenance and data-center proximity issues (which often translate to increased costs), masking of some of the legal/financial issues of multi-site presence (e.g. if you want multi-jurisdictional points of presence), etc.

There are of course ways to use mixed strategies to combine cloud/third party services/CDN with physical infrastructure.

As always, "right tool for the job". I just felt it was worth mentioning there are limitations with single-provider cloud infrastructures as it's possible to get excited about the benefits of a particular approach without considering the full picture.


Great points. Something like logstash (which supports graphite/statsd-type outputs) is a nice addition to the logging setup.
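
For instance, a hedged sketch (plugin options from memory) that tails the nginx logs and bumps a statsd counter per request:

    # count nginx requests into statsd (and on to graphite)
    bin/logstash -e 'input { file { path => "/var/log/nginx/access.log" } }
                     output { statsd { host => "localhost" increment => "nginx.requests" } }'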

Also, for some apps, deploying code during instance launch can be time-consuming. Depending on your monitors and auto-scale config, this might be acceptable. For some it may be more appropriate to put the code right in the AMI and only ever deploy via a new AMI.


This is good as far as it goes -- nice to see further corroboration that EBS is not always the best solution.

What I'd really like to see is details of how the traffic monitoring works, how you calculate how much extra capacity to spin up, and how quickly you can react to load events.


It's unhelpful to needlessly abbreviate ordinary words, like "infra" for "infrastructure" in the first paragraph.


You are right - I committed a fix, but it looks like GitHub is taking a while to rebuild the pages.


Lost me at NFS.


How does your organization mount volumes found on another host?


S3/EBS at the moment. MogileFS/SMB in the past, though Mogile and S3 are more properly referred to as object stores.
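
For the EBS case, that means attaching a volume to one instance at a time rather than sharing it across hosts; a hedged sketch (made-up IDs):

    # attach an EBS volume to a single instance and mount it locally
    aws ec2 attach-volume --volume-id vol-12345678 --instance-id i-12345678 --device /dev/sdf
    sudo mount /dev/xvdf /mnt/data    # the device often shows up as xvdf on Ubuntu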


Ah. I'm new to AWS and had not yet explored the 'storage' options.

I guess I'd been thinking of AWS as 'VMware, but more so', without thinking hard about how it could be different.



