Autoscaling on AWS (qzaidi.github.io)
68 points by mqzaidi on June 21, 2013 | 28 comments



I have found that using an Ubuntu base AMI and Salt (http://saltstack.com/community.html) makes things much easier to manage. His suggestion of baking node.js and nginx into the AMI becomes a big pain when you need to upgrade node.js or nginx (hello, security update!). Using something like Salt (or Puppet or Chef) would make this much easier.
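
As a sketch of the upside (assuming the fleet is already enrolled with a Salt master), a security update becomes a one-liner instead of an AMI rebake:

    # hypothetical: upgrade packages on every minion at once
    salt '*' pkg.upgrade
    # or converge everything back to the declared states
    salt '*' state.highstate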

But in general, I agree with the article.


Yes, it is a pain to bake that much into your AMI, but the upside is that you autoscale faster. When you can scale faster, you can set your thresholds higher, because you know a new instance will be ready to serve traffic through the ELB sooner.

Granted, node and nginx don't take long to update. But the more updates you keep piling on, the longer your true "time to ready" becomes. And the stretch from EC2 launch to ELB registration can already take a while.

Why not have the best of both and use puppet/chef to rebuild your AMI any time the baked-in packages require a security update?
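
Roughly (a hedged sketch with made-up names and IDs, using today's aws CLI): converge a builder instance, bake it, and point autoscaling at the result:

    # converge a long-lived builder instance, then snapshot it into a fresh AMI
    ssh builder 'sudo chef-client'
    aws ec2 create-image --instance-id i-0123456789abcdef0 --name "web-$(date +%Y%m%d)"

    # register a launch configuration for the new AMI and swap the group over
    aws autoscaling create-launch-configuration \
        --launch-configuration-name web-20130621 \
        --image-id ami-12345678 --instance-type m1.large
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name web --launch-configuration-name web-20130621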


We enable auto-applied security updates in the AMI, and the first thing the provisioning script does is update packages. That said, I would definitely want to explore alternatives - not needing to update while provisioning would shave a few seconds off the provisioning time.
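
For reference, a minimal sketch of that first provisioning step on Ubuntu (the part it would be nice to make unnecessary):

    #!/bin/bash
    # hypothetical opening of the provisioning script
    apt-get update -q
    apt-get upgrade -y -q    # the few seconds we'd like to shave off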


What if you rebuilt the AMI every time you deployed?



Agreed. We've had success using a bare-bones Ubuntu AMI and then having the user data install Chef and do a Chef run.

This results in slower startups for new machines, but the flexibility seems to be worth it. Eventually we'll get to a point where the system will just build a new AMI any time a cookbook change is committed.
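
Roughly, the user data looks like this (a hedged sketch; the install URL and first-boot JSON are illustrative):

    #!/bin/bash
    # bootstrap Chef on a bare Ubuntu AMI, then converge the node
    curl -L https://www.opscode.com/chef/install.sh | bash
    chef-client -j /etc/chef/first-boot.json    # run-list shipped in the first-boot JSON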


I'm probably not understanding something, but couldn't you use Puppet or Chef for your manual provisioning tasks?


What CSS/framework/whatever are you using for that metro-style dashboard?


That's Dashing from Shopify. It's really awesome; it took me only an hour or so to set up with the integrations. Here's the original: https://github.com/Shopify/dashing

I am using the Node.js port though: npm install dashing-js


Looks like Dashing from Shopify:

http://shopify.github.io/dashing/


Dashing is a great thing to play around with in the evening. I've got a bunch of custom data sources I should really open source for other people, like Heroku dyno counts etc.


Not to detract from this article, which is explicitly on the subject of AWS, but if you are relying on any single cloud provider then you are probably not going to scale beyond a certain point, since it means that reliability (i.e. heterogeneity across cloud providers, physical/logical points of presence, legal jurisdictions, etc.) is not all that important to you. At a certain point, a cloud-provider-neutral approach is called for.


It's not that simple. Having only one cloud provider allows you to make the most of it.

Once you have two providers, you are forced to use only the features offered by both of them, behind an abstraction layer (API-agnostic development).

Plus, AWS offers a wide range of strategies to ensure the availability of its infrastructure (Availability Zones, Regions, CDN, etc.).


> Once you have two providers, you are forced to use only the features offered by both of them, behind an abstraction layer (API-agnostic development).

Abstraction layers don't necessarily need to produce lowest common denominator results.

> Plus, AWS offers a wide range of strategies to ensure the availability of its infrastructure (Availability Zones, Regions, CDN, etc.).

You still wind up vulnerable to quirks of the single provider, though! For example, an account freeze for whatever reason (financial quirks, legal issues, regulatory change), or multi-site failures (e.g. financial, operational, legal, security) at that provider.


> Abstraction layers don't necessarily need to produce lowest common denominator results.

Except in the special case where you can build the functionality a provider doesn't offer out of functionality it does, I think they do, since otherwise the abstraction leaks. A leaky abstraction is often worse than none.


I think your assumption is that cloud providers with different features cannot be abstracted without making the whole abstraction 'leaky'.

While your perspective may hold for a traditional, rigid, single-layer abstraction with the most simplistic, binary notion of feature presence, it does not hold for better-formed solutions. From https://en.wikipedia.org/wiki/Abstraction_layer: "All problems in computer science can be solved by another level of indirection" (David Wheeler).

Some real world differences between cloud providers: available hardware, available installation images (OS images), available bandwidth, available logical location on the internet, available physical location (legal jurisdiction, etc.), scale-out time, cost model.

How do you conceive of these differences, and then deploy arbitrary services to an arbitrary number of individual instances running unique combinations of cloud provider specific features on unique cloud-provider deployed OS images in parallel? There are various approaches, but it's not that hard to come up with a functional set of abstractions. Think about it.


I'd like to hear more about this. When buying or renting physical machines is worthwhile, how would I actually go about it? Should I get them preassembled, should I buy parts and glue them together, or do people just generally rent their infrastructure?


If your workload is totally standard then renting dedicated servers can make sense; it's certainly flexible if you expect to scale on a monthly basis. However, if you need something "premium" like an SSD or lots of RAM, you can end up paying $300/month. Buying a used server off eBay for $400, putting a new HDD in it, and getting colocation for $100/month will be the most cost-efficient for those on a tight budget. The provider will usually do basic maintenance like replacing HDDs for a small fee.
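
To put rough numbers on it: versus the $300/month rental, the $400 used server plus $100/month colo pays for itself in $400 / ($300 - $100) = 2 months.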

That said, if money is not a problem then go with whatever lets you get stuff done with the least pain.


At a certain point, real hardware is called for if you wish to scale. My experience: 1/3 the price and 2x the performance on SoftLayer vs. a $97k/mo EC2 spend. Obviously specific to my workload, but you should at least benchmark on real hardware.


Agreed.

However, cloud does have benefits that you lose with physical, e.g. speed of initial deployment, speed of scale-out (no ordering/assembly/wiring/burn-in-test lead times), fewer maintenance and data-center proximity issues (which often translate to increased costs), masking of some of the legal/financial issues of multi-site presence (e.g. if you want multi-jurisdictional points of presence), etc.

There are of course ways to use mixed strategies to combine cloud/third party services/CDN with physical infrastructure.

As always, "right tool for the job". I just felt it was worth mentioning there are limitations with single-provider cloud infrastructures as it's possible to get excited about the benefits of a particular approach without considering the full picture.


Great points. Something like logstash (which supports graphite/statsd-type outputs) is a nice addition to the logging setup.
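
For instance, a hedged sketch (plugin options from memory) that tails the nginx logs and bumps a statsd counter per request:

    # count nginx requests into statsd (and on to graphite)
    bin/logstash -e 'input { file { path => "/var/log/nginx/access.log" } }
                     output { statsd { host => "localhost" increment => "nginx.requests" } }'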

Also, for some apps, deploying code during instance launch can be time-consuming. Depending on your monitors and auto-scale config, this might be acceptable. For some it may be more appropriate to put the code right in the AMI and only ever deploy via a new AMI.


This is good as far as it goes -- nice to see further corroboration that EBS is not always the best solution.

What I'd really like to see is details of how the traffic monitoring works, how you calculate how much extra capacity to spin up, and how quickly you can react to load events.


It's unhelpful to needlessly abbreviate ordinary words, like "infra" for "infrastructure" in the first paragraph.


You are right - I committed a fix, but it looks like GitHub is taking a while to rebuild the pages.


Lost me at NFS.


How does your organization mount volumes found on another host?


S3/EBS at the moment. MogileFS/SMB in the past, though Mogile and S3 are more properly referred to as object stores.
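
For the EBS case, that means attaching a volume to one instance at a time rather than sharing it across hosts; a hedged sketch (made-up IDs):

    # attach an EBS volume to a single instance and mount it locally
    aws ec2 attach-volume --volume-id vol-12345678 --instance-id i-12345678 --device /dev/sdf
    sudo mount /dev/xvdf /mnt/data    # the device often shows up as xvdf on Ubuntu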


Ah. I'm new to AWS and had not yet explored the 'storage' options.

I guess I'd been thinking of AWS as 'VMware, but more so', without thinking hard about how it could be different.



