I understand what's happening -- there's no need to explain. Running a dedicated...

cestith · on Aug 13, 2019

You do make some solid points. You assume, however, that there's a bandwidth cost between the load balancer and the backend which won't be true for everyone. You also don't consider the cache behind the load balancer might be much larger and have a much longer TTL than the CDN's cache. Economics of this sort of setup are entirely different if you're putting every piece on a cloud instance rather than having a rack somewhere with your private data flowing for free over your own switch.

dmarby · on Aug 13, 2019

Author of the post here, figured I'd clarify some things since there seem to be some major misconceptions present.

First off, I don't claim to be an expert, I find that a pretty arrogant title for anyone to use. I'd like to think I know a thing or two about building highly scalable webservices however, and of course I'm always open to the opportunity to learn if I'm doing things incorrectly.

That said, Picsum is what I use to play around with new technologies and try new things since it's high-traffic enough that I can get some real data on how things perform. Is it very over-engineered? Absolutely, but that's part of the fun.

When it comes to Picsum, the reason for not pre-processing all the images is that there are simply too many variations with the sizes and variations you can request through the API. For every image, there are 5001 * 5001 * 22 variations that can be requested, and in total, we have just under a thousand source images.

As for running Varnish behind our CDN, this is done for a couple of reasons:

- We can make sure that an image is only processed once simultaneously, even tho the CDN might request it multiple times before it's cache has been filled.

- We can apply optimizations, such as sorting and filtering the query parameters for variations, to achieve a better cache rate. This is not possible to do with the CDN provider we use.

The resources it uses are negligible, the extra latency within the cluster is vanishingly small, and it saves us a lot of extra processing. Every service within the Kubernetes cluster runs at least two replicas, varnish included, for redundancy and to distribute the load. We're not using separate servers for each layer/component, that'd be wasteful.

As for bandwidth costs, there's no cost for the bandwidth between the CDN and the load balancer, as DigitalOcean does not charge for load balancer bandwidth. There's also no cost for anything behind the load balancer, as this is all internal traffic, either within DigitalOcean or within the Kubernetes cluster itself.

Talking about Spaces, I think you might be confused. Spaces is an object storage, which also happens to have optional CDN capability built-in. Picsum only uses the object storage part, for storing the source images that are used for processing. The reason we use Redis to cache said source images is to avoid having to fetch them from Spaces on every request, as this is rather slow comparatively. An important distinction here is that Spaces/Redis stores and caches the source images, not the processed ones, which are cached by Varnish and the CDN.

As an aside, since you seem to think that comparing numbers for services with vastly different needs and usecases is worthwhile, Picsum serves a bit over 8TB of traffic a month, and costs less then your setup to run.