
For everyone saying this is obvious: I'm one of those idiots who didn't quite get it, and I appreciate this post a lot.

For the past week, I've been working on writing a little static site generator and putting the output on S3, and I thought I was pretty damn clever for finally getting around to doing that... except now it turns out I'm clueless yet again. I'm looking at CloudFront now, but I'm still not sure if it has all the features I expected out of S3 alone (someone already mentioned Route 53 integration).




I don't think you're an idiot; more likely you were just unaware of CloudFront, or perhaps of CDNs in general.

S3 is for storing files; CloudFront is for serving cached copies of them really quickly out of edge locations (i.e. the CDN datacentre closest to the user who requests it).

In your use case, your best bet is to use them in combination: set up a CloudFront distribution that points to your S3 bucket, then set up DNS to point to your CloudFront distribution.
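If it helps, here's a rough sketch of that setup with boto3. All names (bucket, region, comment) are hypothetical, and the DistributionConfig is trimmed to the required fields, so treat it as a starting point rather than a recipe:

```python
# Sketch: put a CloudFront distribution in front of an S3 static-website
# origin. Bucket name, region, and config values are hypothetical.

def s3_website_origin(bucket: str, region: str) -> str:
    """The S3 static-website endpoint CloudFront should use as its origin."""
    return f"{bucket}.s3-website-{region}.amazonaws.com"

def create_site_distribution(bucket: str, region: str) -> None:
    import boto3  # only needed when actually talking to AWS
    cf = boto3.client("cloudfront")
    cf.create_distribution(DistributionConfig={
        "CallerReference": f"{bucket}-static-site",  # must be unique
        "Comment": "static site",
        "Enabled": True,
        "Origins": {"Quantity": 1, "Items": [{
            "Id": "s3-website",
            "DomainName": s3_website_origin(bucket, region),
            # S3 website endpoints only speak plain HTTP
            "CustomOriginConfig": {"HTTPPort": 80, "HTTPSPort": 443,
                                   "OriginProtocolPolicy": "http-only"},
        }]},
        "DefaultCacheBehavior": {
            "TargetOriginId": "s3-website",
            "ViewerProtocolPolicy": "allow-all",
            "ForwardedValues": {"QueryString": False,
                                "Cookies": {"Forward": "none"}},
            "TrustedSigners": {"Enabled": False, "Quantity": 0},
            "MinTTL": 0,
        },
    })

# Usage (needs AWS credentials):
#   create_site_distribution("my-site", "us-east-1")
```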

One big point to note is that you need to either:

* Configure CloudFront to expire objects after a TTL (time to live) that is reasonable for you (e.g. 1 hour, 1 day, etc.). You can do this from the CloudFront 'new distribution' wizard.

OR

* Let CloudFront respect HTTP headers, and make S3 (or whatever your custom origin is) set Cache-Control headers that make sense for how often you update your site. With S3 you can set Cache-Control as object metadata when you upload; with a custom origin it's your app, so you can set whatever HTTP headers you like.

To be clear: if you don't do either of these, I'm pretty sure CloudFront caches things forever, or at least for a very long time.
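For the second option, S3 does let you attach a Cache-Control header as object metadata at upload time, which CloudFront will then respect. A minimal boto3 sketch, with a hypothetical bucket name and file paths:

```python
# Sketch: upload a file to S3 with a Cache-Control header, so a CloudFront
# distribution configured to respect origin headers expires it after the
# given TTL. Bucket name and paths are hypothetical.

def cache_control(max_age_seconds: int) -> str:
    """Build a Cache-Control value, e.g. 3600 -> 'max-age=3600'."""
    return f"max-age={max_age_seconds}"

def upload_page(bucket: str, local_path: str, key: str, ttl: int = 3600) -> None:
    import boto3  # only needed when actually talking to AWS
    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key,
                   ExtraArgs={"CacheControl": cache_control(ttl),
                              "ContentType": "text/html"})

# Usage (needs AWS credentials):
#   upload_page("my-site", "out/index.html", "index.html", ttl=3600)
```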

Personally, I think triggering cache invalidations should be reserved for emergencies (e.g. someone has uploaded questionable content to serve to other users and it's cached at an edge). Rather than screwing around with that, save yourself some headaches: pick a sensible TTL and wait a little longer for things to be up to date at your edges.
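For that emergency case, an invalidation is a single API call. A boto3 sketch, where the distribution ID and paths are hypothetical:

```python
# Sketch: an emergency CloudFront invalidation to purge a bad object from
# the edges. Distribution ID and paths are hypothetical.
import time

def invalidation_batch(paths: list) -> dict:
    """Build the InvalidationBatch payload create_invalidation expects."""
    return {
        "Paths": {"Quantity": len(paths), "Items": paths},
        "CallerReference": str(time.time()),  # must be unique per request
    }

def invalidate(distribution_id: str, paths: list) -> None:
    import boto3  # only needed when actually talking to AWS
    cf = boto3.client("cloudfront")
    cf.create_invalidation(DistributionId=distribution_id,
                           InvalidationBatch=invalidation_batch(paths))

# Usage (needs AWS credentials):
#   invalidate("E2EXAMPLE", ["/images/questionable.png"])
```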

Note that by using CloudFront in this manner, you get the performance benefit of serving static files from an edge cache. If performance at the expense of convenience was your main reason for going with static site generation, you might want to rethink that decision (there are other perfectly good reasons for wanting to use a static site generator, security being my favourite).

Do feel free to ping me over email if you have any questions on the above.


>If performance at the expense of convenience was your main reason for going with static site generation, you might want to rethink that decision (there are other perfectly good reasons for wanting to use a static site generator, security being my favourite).

Great comment. I didn't get this part though. What would be a better alternative?

Performance is one of the main reasons I considered this (uptime is another). Let's just say I've had the same shared hosting for over 5 years, and the speed/uptime have been a disappointment for a long time. When I was working on a site and noticed a 500 kB background image taking 2 seconds to load, and around the same time saw that the Spotify homepage was streaming a fullscreen video instantly, that was kind of the last straw.

So I thought the idea was that skipping dynamic generation, using a distributed host (well, my assumption was that S3 itself had edge locations), and just having a better host would be a big win.


> I didn't get this part though. What would be a better alternative?

To clarify: I'm not saying that static assets behind an edge cache aren't performant, just that a dynamically generated site behind an edge cache is effectively the same performance-wise. It's probably not worth the sacrifice in convenience if performance was your main reason for going the statically generated route.

I can't speak to the argument from uptime, as I haven't used shared hosting in a while. Using an edge cache (without S3) might help a little there, since it only needs to hit your shared hosting on cache expiry, but that obviously won't be as safe as statically generating the files and having CF read them out of S3.

I think S3 behind CF is a perfectly good approach. I was just saying that if you've currently got a dynamically generated site and are considering moving to static generation because of performance alone, the trade-off probably isn't worth the effort.

I wouldn't advise you personally to go back on that decision at all, especially because of the issues you've seen with uptime.

> I think my assumption was S3 did have edge locations

I think you get this from the rest of my comments, but just to be clear: I've only ever experienced bad download times from S3, and would not feel comfortable recommending that you use it to serve traffic directly to the internet. It's not what S3 is for, so you shouldn't expect good performance from it in that use case.


> I was working on a site and noticed a 500kb background image was taking 2 seconds to load and around the same time I saw that the spotify homepage was streaming a fullscreen video instantly

I don't think that's a fair comparison - it seems unlikely to me that the video requires much data to get started, or a sustained 250kB/s (about 2 megabits/s) connection to play.


You should simply put CloudFront in front of S3.

That way, you get the best of both worlds.


You're not clueless. S3 is almost certainly good enough for what you're doing.


I wrote a static site generator at the beginning of this year :)

https://github.com/jimktrains/gus


@jimktrains2 +1


You should really be using CloudFront, not S3, if you are looking for a CDN solution. There is a reason CloudFront exists, after all. S3 is merely a drop-in storage option; think of it as a big file system residing in datacentres with fat pipes.


Put Fastly in front of your S3 static site. It works really well.



I think I'm going to go with CloudFront, but I'm pretty upset about not being able to use a naked domain.



