How I served 100k users without breaking the server or a dollar bill (alastair.is)
79 points by untog on April 26, 2013 | 39 comments



I really like Amazon's S3 and similar services; I just wish Amazon allowed me to set a cap on my account.

I'm aware of billing alerts, and they're quite useful. But I'm really paranoid that one of my properties is going to get linked on Reddit, Digg, or similar, and then I'll wake up to a $5K bill that I can ill afford.


My original, unedited post title (grumble grumble) sort of covered that- I served content to 100k users, but it cost me less than a dollar. S3 is amazingly cheap.


Why did the submission title get changed from "How I served 100k users without breaking the server- or a dollar bill." to "Hosting a Website on S3"?


The moderators did it.


The moderators were heavy-handed, in my opinion. Original title was great. It wasn't misleading or link-baity. Not sure why it was changed.


My guess would be because this has already been discussed many times on HN so they felt it was better to just call it what it was.

I'm not saying I agree, just making an observation.


Not going to disagree (because, really, it isn't worth it) but I thought my original title brought a little more emphasis to how cheap S3 is- as the OP demonstrated, I'm not sure that's so widely known.


If you have a completely static website (like a blog or portfolio), instead of having an API server on EC2 or Heroku or wherever, you can host your site entirely on Amazon S3. You can also put CloudFront in front of it, which reduces latency for your global visitors.

I have a tutorial on using Jekyll, the static site generator that I'm sure a lot of you are familiar with: http://learn.andrewmunsell.com/learn/jekyll-by-example/

I also go over how to use Amazon S3 and Dropbox for hosting, since Jekyll is pretty versatile (it just spits out HTML).
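If you'd rather script the setup than click through the console, here's a minimal sketch using the official Node.js aws-sdk (the bucket name is a placeholder; the same options appear in the console under "Static Website Hosting"):

    var AWS = require('aws-sdk'); // npm install aws-sdk
    var s3 = new AWS.S3();

    // Tell S3 to serve the bucket as a website.
    // 'example.com' is a placeholder bucket name.
    s3.putBucketWebsite({
      Bucket: 'example.com',
      WebsiteConfiguration: {
        IndexDocument: { Suffix: 'index.html' },
        ErrorDocument: { Key: '404.html' }
      }
    }, function (err) {
      if (err) console.error(err);
    });

After that it's just a matter of uploading whatever Jekyll writes to _site as individual objects.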


A friend and I have both recently started familiarizing ourselves with Jekyll, and your guide has been fantastic. Thanks for making it.


I'm glad it's been helpful!


I was going to say that there were implications to mapping the "www" subdomain like this, because S3 didn't support root domains, but that changed in the last few months:

http://aws.typepad.com/aws/2012/12/root-domain-website-hosti...

(plus, there's redirection)

edit: There are consequences relating to MX records, such as if you want mail service on the domain. From a comment:

> Just be careful here with doing CNAMEs on root domains. Things like email will break because the MX records are no longer visible behind the CNAME. Gmail for example won't send emails to domains with a CNAME as root.

Related article: NPR's apps team had a nice post about how most of their projects are S3-hosted flat files: http://blog.apps.npr.org/2013/02/14/app-template-redux.html


I don't believe using Amazon's DNS like this results in a "CNAME on the root domain" (a technically invalid situation in all cases, since a CNAME can't coexist with other records). I think it causes Amazon to return an A record that has been computed by aliasing internally within Amazon's DNS system.

Edit: Amazon has actually clarified this point in the comments of the article as well -

‘You're completely right about CNAMEs at the domain apex, they do tend to break MX and NS records. When this feature is used with Route 53 ALIAS records, no CNAME will be present - behind the scenes the ALIAS record links directly to the S3 "A" record data.’
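To make that concrete, here's roughly what creating one of those ALIAS records looks like via the Node aws-sdk (a sketch: the zone IDs and domain are placeholders, and the AliasTarget zone ID is the fixed, region-specific ID of the S3 website endpoint from Amazon's docs, not your own zone's):

    var AWS = require('aws-sdk');
    var route53 = new AWS.Route53();

    route53.changeResourceRecordSets({
      HostedZoneId: 'ZXXXXXXXXXXXXX', // your hosted zone (placeholder)
      ChangeBatch: {
        Changes: [{
          Action: 'CREATE',
          ResourceRecordSet: {
            Name: 'example.com.',
            Type: 'A', // an A record, not a CNAME, so MX/NS are unaffected
            AliasTarget: {
              // region-specific S3 website endpoint zone ID (us-east-1 shown)
              HostedZoneId: 'Z3AQBSTGFYJSTF',
              DNSName: 's3-website-us-east-1.amazonaws.com.',
              EvaluateTargetHealth: false
            }
          }
        }]
      }
    }, function (err) {
      if (err) console.error(err);
    });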


It's worth noting that if you want to fully leverage the scalability of S3, using your domain registrar's HTTP 302 "URL Redirect" feature is likely a bad idea.

ismytwitterpasswordsecure.com resolves to 192.31.186.144, which is some unknown HTTP server at Namecheap, while www.ismytwitterpasswordsecure.com resolves to a CNAME that causes the browser to hit Amazon directly.

As mentioned by another commenter, the "proper" solution, which doesn't depend on Namecheap's HTTP server of unknown quality serving up redirects, is to use Amazon's DNS hosting and their proprietary aliasing solution: http://aws.typepad.com/aws/2012/12/root-domain-website-hosti...


TL;DR: use Amz S3 to host static content for cheap


I'll agree that it's simplified, but it might just be the spark that gets me to use that free account that's dwindling away. Seems pretty simple: set up static site/DNS, turn off listing in the bucket.

Amazon products create insta-fear in me. It seems very overwhelming. I usually just fall back on DigitalOcean/Linode whenever I consider using S3.


I used to feel the insta-fear. All those EC2 configuration options, not to mention ELB, and separate services for everything I could manually set up.

But S3 is simple. Big buckets, tons of files in flat order. No magic involved. I've been very happy with it and you can still take your computing needs elsewhere.


Ha, thought I was the only one who felt AWS insta-fear. Maybe I should try this just as a learning experiment.


I'm hardly technically competent, and S3 has served me very well for static files. I've dabbled around with their other products to learn a bit here and there and have definitely been overwhelmed, but S3 is really really simple.


I was going to include my subtitle, but the HN title limit got the better of me. It is a little bit of a clickbait headline, I'll give you that. Just no-one tell HuffPo Spoilers: https://twitter.com/huffpospoilers

(side note: this blog post isn't hosted on S3. If it goes down, I will eat a big slice of humble pie)

(side note #2: the title has since changed to make this post make no sense whatsoever)


In addition to being cheap, it's reliable. I haven't had any downtime on my sites since switching them over to S3.


I've been afraid to use S3 like this because S3 requests sometimes fail and the client needs to retry after a short delay. S3 libraries do retries under the hood, but web browsers don't - instead the user sees an error message or a broken page. This probably wouldn't happen much on a low-traffic site, but a site getting as many hits as this guy's probably has problems for a non-trivial number of users.
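If the S3-hosted assets are fetched via XHR (JSON data, templates, etc.) you can at least paper over it in your own code; a sketch of a retry wrapper (URL, delay, and retry count are illustrative):

    function getWithRetry(url, retriesLeft, callback) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', url, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) return;
        if (xhr.status === 200) {
          callback(null, xhr.responseText);
        } else if (retriesLeft > 0) {
          // Amazon's guidance is to back off and retry on 5xx errors
          setTimeout(function () {
            getWithRetry(url, retriesLeft - 1, callback);
          }, 500);
        } else {
          callback(new Error('Request failed: ' + xhr.status));
        }
      };
      xhr.send();
    }

That doesn't help with plain <img> or <script> tags, though.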


Well, you could always use CloudFront as well, which would prevent these errors from ever appearing to your website's visitors.


I had read that, and was expecting it (for a silly idea like this I don't really care), but I never received any feedback from anyone who received an error.


Do you have any data on how much is 'non-trivial'? (Not disbelieving you, I'm wondering whether it's trivial for my use case.)


Here's an HN comment I bookmarked about this: https://news.ycombinator.com/item?id=4976893

That guy was reporting a 2% error rate from one S3 server, but additional discussion in that thread suggests that error rate is an aberration. Personally, I have no hard data, just the anecdote that when I started developing with S3 I wasn't using retries, and that became a problem after a few thousand queries. Unfortunately I've never seen Amazon say what error rate to expect, just that you should be sure to retry on error. I think I'll start logging how often I need to retry so I can get some hard data.

Keep in mind that when you're hosting a site on S3, you have to multiply the error rate by the number of resources you're serving from S3 to get the probability an individual user will have problems. www.ismytwitterpasswordsecure.com serves 4 resources from S3. Assuming a 0.1% error rate, 400 out of the 100,000 users would have had problems.
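For the pedantic, the multiplication is an approximation; treating the 4 requests as independent failures gives nearly the same figure:

    var p = 0.001;                             // assumed 0.1% per-request error rate
    var n = 4;                                 // resources served from S3
    var perUser = 1 - Math.pow(1 - p, n);      // ~0.004, i.e. ~0.4% per visitor
    console.log(Math.round(perUser * 100000)); // ~399 of 100,000 users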

As another commenter has suggested, you can serve from CloudFront instead of directly from S3. That will significantly reduce the number of hits made to S3.


Amazon has a ton of guides on common use cases for their services, one of which is hosting a static website on AWS using Route 53 to handle DNS, S3 to manage files and requests, and CloudFront as your CDN.

http://docs.aws.amazon.com/gettingstarted/latest/swh/website...


I'd like to host a static site on S3 with SSL. But AIUI this isn't possible with S3/CloudFront alone because the hostname in the SSL certificate won't match the custom CNAME.

What do most folks do in this situation? Put CloudFlare or Fastly in front of S3?


Do you know of any way I could access a database through a static page? I mean, I'd like an "open to read, but not write" database to feed content to static HTML through JavaScript. That would be neat.


I covered it very briefly in the post, but could/should have gone into more detail. You can store all of your static HTML/CSS/JS in an S3 bucket, then have a separate EC2 instance (or whatever) to do the database stuff.

Create it as an API that's reachable through JSONP or (better) CORS. It won't stop you having scaling issues, but it'll help to reduce the number of requests hammering your database. I've recently spent an unhealthy amount of time setting up Varnish to cache dynamic content, so a post on that topic will be coming down the pipeline soon.
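The CORS half of that is mostly a response header on the API side. A minimal sketch with Express (the route, origin, and payload are all placeholders, not what I actually run):

    var express = require('express'); // npm install express
    var app = express();

    app.get('/api/items', function (req, res) {
      // allow the S3-hosted front-end's origin to call this API
      res.setHeader('Access-Control-Allow-Origin', 'http://example.com');
      res.json([{ id: 1, name: 'example' }]);
    });

    app.listen(3000);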


Ajax requests are what you're after. Store your data in JSON or plain text files and use your favourite JavaScript framework to retrieve them client-side.
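Concretely, with jQuery it's about this much (the file path is a placeholder for a JSON file sitting in the same bucket as the page):

    $.getJSON('data/posts.json', function (posts) {
      posts.forEach(function (post) {
        $('#posts').append('<li>' + post.title + '</li>');
      });
    });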


Keep a database on your computer, write a quick script that dumps it to JSON and uploads it to S3 automatically, and have the JavaScript hit that file. Run the script every time you change the data.
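In Node that whole pipeline is only a few lines; a sketch, where the bucket/key names are placeholders and getAllRows() stands in for your own database access code:

    var AWS = require('aws-sdk');
    var s3 = new AWS.S3();

    getAllRows(function (err, rows) {   // hypothetical DB helper
      if (err) throw err;
      s3.putObject({
        Bucket: 'example.com',
        Key: 'data.json',
        Body: JSON.stringify(rows),
        ContentType: 'application/json',
        ACL: 'public-read'              // readable by browsers, not writable
      }, function (err) {
        if (err) console.error(err);
      });
    });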


Look into Firebase: https://www.firebase.com/. They have a JS API that works straight from the browser.
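The "read but not write" part is handled by their security rules; the client side is roughly this (the app URL is a placeholder, and it assumes the Firebase client script is already included on the page):

    var ref = new Firebase('https://your-app.firebaseio.com/posts');
    ref.on('value', function (snapshot) {
      console.log(snapshot.val()); // re-runs whenever the data changes
    });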


You should be able to use JSONP regardless of the data source to accomplish that.
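Without a library, JSONP is just a dynamically injected script tag (endpoint and callback name are placeholders):

    function handleData(data) {
      console.log(data);
    }

    var script = document.createElement('script');
    script.src = 'http://api.example.com/items?callback=handleData';
    document.body.appendChild(script);

    // The server responds with JavaScript that calls the named callback:
    //   handleData([{"id": 1}]);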


Does anyone have any commentary on using S3 with analytics software (Google or others)?

The process of creating a specific "www.example.com" bucket just to do the redirects seems to break my analytics software (Piwik); as a result, it doesn't track referrers properly. That is, anyone who visits via a www.example.com link will have no referrer and just show up as "Direct Entry", since they hit the redirect and then arrived at example.com.


Sure, I'm going to submit my twitter password to some random website (not even served over SSL). Nice try. So that guy now has 100k twitter passwords?


100K users can't be wrong. I tried it a couple of days ago and it worked, as I have not been hacked since. You should at least try it.


Honestly, just make up some fake info to see why everyone enjoyed the site.


Try typing something (anything!) into the form on the site and you will understand.


Read the blog post and visit the site. He's not storing the passwords.



