If you wanted to scale this to multiple proxies without each one holding a duplicate cache, take a consistent hash of the URL and proxy the request to its owner.
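A minimal sketch of that routing, assuming a hash ring over proxy names (the proxy names, vnode count, and helper names here are illustrative, not from any particular proxy implementation):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable across processes, unlike Python's built-in hash().
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Maps URLs to the proxy that 'owns' them; adding or removing a
    proxy only remaps roughly 1/N of the keys."""

    def __init__(self, proxies, vnodes=100):
        # Virtual nodes smooth out the key distribution per proxy.
        self._ring = sorted(
            (_hash(f"{p}#{i}"), p)
            for p in proxies
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    def owner(self, url: str) -> str:
        # First ring position at or after the URL's hash, wrapping around.
        idx = bisect.bisect(self._keys, _hash(url)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["proxy-a", "proxy-b", "proxy-c"])
print(ring.owner("https://example.com/video/123.mp4"))
```

Any proxy that receives a request can compute the same owner and forward to it, so there is no coordination needed beyond agreeing on the proxy list.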
We used to do this to cache the first chunk of videos. Encoding with `-movflags faststart` typically ensured that the moov atom was cached at an edge, which dramatically decreased the wait before video playback started (while still supporting arbitrary seeking within large/long video files).
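The reason this works: faststart moves the moov atom (the index the player needs before it can start) to the front of the file, so the edge only has to hold the first range of bytes. A small sketch of checking that property by walking top-level MP4 boxes (a simplified parser, not a full MP4 reader):

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (box_type, size) for each top-level MP4 box in `data`."""
    pos = 0
    while pos + 8 <= len(data):
        # Each box starts with a 4-byte big-endian size and 4-byte type.
        size, = struct.unpack(">I", data[pos:pos + 4])
        box_type = data[pos + 4:pos + 8].decode("ascii", "replace")
        if size < 8:  # extended/zero sizes: out of scope for this sketch
            break
        yield box_type, size
        pos += size

def moov_is_first(head: bytes) -> bool:
    # With -movflags faststart, moov follows ftyp at the front,
    # so the first cached chunk is enough to begin playback.
    types = [t for t, _ in top_level_boxes(head)]
    return "moov" in types[:2]
```

Without faststart, moov sits after the (huge) mdat box at the end of the file, and the player has to fetch the tail before it can start.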
CDNs aren’t intended as a “canonical store”; content can be invalidated from a CDN’s caches at any time, for any reason (e.g. because the CDN replaced one of their disk nodes), and the CDN expects to be able to re-fetch it from the origin. You need to maintain the canonical store yourself — usually in the form of an object store. (Also, because CDNs try to be nearly-stateless, they don’t tend to be built with an architecture capable of fetching one “primary” copy of your canonical-store data and then mirroring it from there; but rather they usually have each CDN node fetch its own copy directly from your origin. That can be expensive for you, if this data is being computed each time it’s fetched!)
Your own HTTP reverse-proxy caching scheme, meanwhile, can be made durable, such that the cache is guaranteed to only re-fetch at explicit controlled intervals. In that sense, it can be the “canonical store”, replacing an object store — at least for the type of data that “expires.”
This provides a very nice pipeline: you can write “reporting” code in your backend, exposed on a regular HTTP route, that does some very expensive computations and then just streams them out as an HTTP response; and then you can put your HTTP reverse-proxy cache in front of that route. As long as the cache is durable, and the caching headers are set correctly, you’ll only actually have the reporting endpoint on the backend re-requested when the previous report expires; so you’ll never do a “redundant” re-computation. And yet you don’t need to write a single scrap of rate-limiting code in the backend itself to protect that endpoint from being used to DDoS your system. It’s inherently protected by the caching.
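A minimal sketch of such a reporting endpoint as a plain WSGI handler (the route, TTL, and "expensive computation" are illustrative): the only protection it needs is a correct Cache-Control header, which the reverse-proxy cache in front honors.

```python
import json

REPORT_TTL = 3600  # seconds; the report is recomputed at most once per hour

def expensive_report() -> dict:
    # Stand-in for heavy database / upstream-API work.
    return {"total": sum(range(1_000_000))}

def report_app(environ, start_response):
    """Stateless WSGI handler: compute, set caching headers, stream out.
    The reverse proxy in front absorbs all repeat traffic."""
    body = json.dumps(expensive_report()).encode()
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        # s-maxage targets shared caches (the reverse proxy),
        # independent of what end-user browsers do.
        ("Cache-Control", f"public, s-maxage={REPORT_TTL}"),
    ])
    return [body]
```

The backend stays a pure request/response function; the schedule ("no more than once per hour") lives entirely in the cache header.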
You get essentially the same semantics as if the backend itself was a worker running a scheduler that triggered the expensive computation and then pushed the result into an object store, which is then fronted by a CDN; but your backend doesn’t need to know anything about scheduling, or object stores, or any of that. It can be completely stateless, just doing some database/upstream-API queries in response to an HTTP request, building a response, and streaming it. It can be a Lambda, or a single non-framework PHP file, or whatever-you-like.
> Also, because CDNs try to be nearly-stateless, they don’t tend to be built with an architecture capable of fetching one “primary” copy of your canonical-store data and then mirroring it from there.
True, though some CDNs let you replicate data up to 25MB in size globally; others support bigger sizes depending on your spend with them.
> And yet you don’t need to write a single scrap of rate-limiting code in the backend itself to protect that endpoint from being used to DDoS your system. It’s inherently protected by the caching.
Systems in steady state aren't what cause extended outages; the recovery phase can also DDoS your systems. For example, when a cache goes cold, the resulting thundering herd can overwhelm an underscaled datastore [0].
> You get essentially the same semantics as if the backend itself was a worker running a scheduler that triggered the expensive computation and then pushed the result into an object store, which is then fronted by a CDN.
Agree, but running one's own high-availability infrastructure is hard for a small team. In some applications, even when scale isn't important, availability almost always is.
> It can be a Lambda, or a single non-framework PHP file, or whatever-you-like.
Agree, a serverless function a CDN runs (Lambda@Edge, Workers, StackPath EdgeEngine etc) would accomplish a similar feat; and that's what we do for our data workloads that front S3 through Cloudflare Workers.
This comment touches on a lot of topics that I'm interested in learning more about but don't have the vocabulary to google or find resources about. Any chance you could point me at somewhere I could read about these patterns in more depth? Handling expensive computations and preventing re-computation on the backend is something I don't know how to solve effectively, because there are several different approaches I can think of (an in-memory cache on the backend, persisting the result and returning it the next time it's asked for, caching the response in front of the service, etc.)
Because we invalidate some SSDB keys in various cases, for example: old code stats data arriving late, renaming a project, etc.
HTTP proxies with ETags seem like a good public-facing cache option, but our caching is internal to our DigitalOcean servers. Here's more info on our infra, which might help with our decision:
Consider using opaque symbols for your object/cache identifiers instead of using meaningful literals. That way, you can simply update the identifier mapping (Project A cache prefix moves from ajsdf09yu8sbvoihjasdg -> klajnsdg9fasf8avby), and the old values will naturally be evicted from the cache as they become less frequently accessed.
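A sketch of that indirection, assuming a small prefix-mapping table (the names and prefixes are illustrative; in practice the map lives somewhere durable): rotating the opaque prefix orphans the old entries, which then age out of an LRU cache on their own.

```python
import secrets

# project -> opaque cache prefix (would be stored durably in practice)
prefix_map = {"project-a": "ajsdf09yu8sbvoihjasdg"}

def cache_key(project: str, item: str) -> str:
    # Callers never see or hard-code the opaque prefix.
    return f"{prefix_map[project]}:{item}"

def rotate_prefix(project: str) -> None:
    """'Invalidate' everything under a project by pointing it at a new
    prefix; old keys are never touched and simply fall out of the LRU."""
    prefix_map[project] = secrets.token_urlsafe(16)

old = cache_key("project-a", "stats")
rotate_prefix("project-a")
new = cache_key("project-a", "stats")
assert old != new  # all lookups now miss the stale entries
```

The trade-off is one extra lookup per cache access and a window where both generations of keys occupy space until the old ones are evicted.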
In my experience, having to perform cache invalidation is usually a sign of design immaturity. Senior engineers have been through this trial before :-)
nginx has proxy_cache_purge (1), but I agree that nginx doesn't expose a particularly good API around its cache. Plus, the awkwardness of doing disk IO from OpenResty (unless this has been improved), is a drag.
Maybe this is something envoy or the newer breeds do better.
Definitely a strong use case for a caching proxy like Varnish or Nginx. The post describes using Redis as an application cache which means your application needs custom logic. A cache lookup before going to S3, and then it needs to write back out to cache when done. Simple, but possibly a lot more work than necessary. A caching proxy does away with this, you make a single call to the proxy, which behaves identically to S3, and it does all the work for you. No application changes needed other than changing the S3 endpoint address. Also, much greater performance because you are potentially cutting out multiple costly round trips.
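The application-cache pattern the post describes boils down to cache-aside. A minimal sketch with in-memory stand-ins for Redis/SSDB and S3 (the dicts and counter are illustrative stand-ins, not real clients):

```python
cache = {}                                              # stand-in for Redis/SSDB
object_store = {"reports/2021.json": b'{"total": 42}'}  # stand-in for S3

fetches = {"s3": 0}  # counts the costly round trips

def get_object(key: str) -> bytes:
    """Cache-aside: check the cache, fall back to the object store,
    then write the result back into the cache for next time."""
    if key in cache:
        return cache[key]
    fetches["s3"] += 1
    data = object_store[key]  # the expensive remote call
    cache[key] = data
    return data

get_object("reports/2021.json")
get_object("reports/2021.json")
assert fetches["s3"] == 1  # second call is served from the cache
```

A caching proxy moves all of this logic out of the application: the app makes one plain GET, and hit/miss/write-back happen transparently in the proxy.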
ETags are not cache invalidation. ETags are a way for a client to validate the cache in the client, while cache invalidation for a http proxy would involve telling the proxy that it should discard a cached file.
---
Basically:
cache validation: Check if a file matches a cached version
cache invalidation: Preemptively tell a cache to discard a cached file
---
For example:
Client A requests file X. Proxy caches file X.
File X changes in the upstream.
Client B requests file X. Proxy does not know the file changed upstream, so it sends a stale version.
---
In this case either the proxy needs to revalidate each request with upstream (which is expensive) or we need some way to tell the proxy that it should discard its cache for file X since it has changed.
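The distinction can be sketched with a toy proxy (everything here is illustrative): revalidation asks upstream "is my copy still good?" per request, while invalidation is a purge pushed to the proxy.

```python
class ToyProxy:
    def __init__(self, upstream):
        self.upstream = upstream  # {path: (etag, body)}
        self.cache = {}           # {path: (etag, body)}

    def get(self, path: str) -> bytes:
        if path in self.cache:
            return self.cache[path][1]  # may be stale!
        self.cache[path] = self.upstream[path]
        return self.cache[path][1]

    def revalidate(self, path: str) -> bytes:
        """Cache *validation*: like a conditional GET with If-None-Match."""
        etag, _ = self.cache.get(path, (None, None))
        up_etag, up_body = self.upstream[path]
        if etag != up_etag:  # upstream says the file changed
            self.cache[path] = (up_etag, up_body)
        return self.cache[path][1]

    def purge(self, path: str) -> None:
        """Cache *invalidation*: upstream tells the proxy to forget the file."""
        self.cache.pop(path, None)

upstream = {"/x": ("v1", b"old")}
proxy = ToyProxy(upstream)
proxy.get("/x")                          # cached as v1
upstream["/x"] = ("v2", b"new")          # file changes upstream
assert proxy.get("/x") == b"old"         # stale: neither mechanism ran
assert proxy.revalidate("/x") == b"new"  # validation fixes it, per request
```

Validation costs an upstream round trip on every request; invalidation is free on the read path but requires the upstream to know who to notify.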
I think I'll have to look into this to see if we can get away with a document/blob cache in one of our solutions actually. I've resisted adding Varnish into the mix, but it might be time to reconsider.
You should really just foot the bill for AWS. This is going to be technical debt for you later on that will need to be paid. As a software/systems engineer I'll walk away from anywhere doing stuff like this to save a few dollars. I've worked a lot of janky places and now I work somewhere that goes the extra mile and puts that little bit extra in to do things right the first time. I know the on-going costs of doing stuff like this as well as the fact that you will have to clean it up at some point.
EDIT: Not necessarily YOU will have to clean it up at some point, but the next guy most likely will.
It's just me, nobody else is working on WakaTime, so it's ok. Also, S3 scales beautifully and has been the best database decision ever. Built-in reliability and redundancy.
SSDB has also been wonderful. It's just as easy and powerful as Redis, without the RAM limitation. I've had zero issues with SSDB regarding maintenance and reliability, as long as you increase your ulimit [1] and run ssdb-cli compact periodically [2].
I've tried dual-writing in production to many databases including Cassandra, RethinkDB, CockroachDB, TimescaleDB, and more. So far, this setup IS the right solution for this problem.
[2] ssdb-cli compact runs garbage collection and needs to be run every day/week/month depending on your writes load, or you'll eventually run out of disk space. Check the blog post for my crontab automating ssdb-cli compact.
I don't think there's anything wrong with this. The correct answer is not always "outsource to AWS"; dependence on AWS (and any external dependency, especially those you have to pay for) is also tech debt. Everything you do is tech debt, and since this is just you, building it yourself is not only respectable (here, have an upvote!) but could also perhaps be something you end up doing 2-3 years down the line, and you've just gotten ahead of yourself.
For many startups, that code that you mentioned as technical debt will be deleted en masse as the business pivots and adapts.
For startups, moving fast and getting 80% done in 20% of the time is the correct choice. It's frustrating to look at the result as a software developer, but the engineer is there to serve the business, which pays the salary. As a software engineer, I very much disliked working in engineering teams that were engineering for the sake of engineering. It quickly devolves into meaningless arguments and drama.
I'm in the process of purging data off S3. I suspect it will be about two weeks of work, and I calculate it will save me roughly $250k over the "lifetime" of my product.
What I think too few people understand about s3 is that costs on S3 compound. Every month you not only pay for data you stored in the current month but you also pay for the data stored in every month previously.
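A quick illustration of the compounding (the upload rate is illustrative; the price is roughly S3's public standard-tier storage rate):

```python
gb_per_month = 100          # new data added each month
price_per_gb_month = 0.023  # USD; approx. S3 standard storage pricing

total_stored = 0
bill = []
for month in range(1, 13):
    total_stored += gb_per_month
    # Each month you pay for everything stored so far, not just new data.
    bill.append(total_stored * price_per_gb_month)

# Month 1 bills for 100 GB; month 12 bills for 1,200 GB.
print(f"month 1: ${bill[0]:.2f}, month 12: ${bill[-1]:.2f}")
```

A flat upload rate therefore produces a linearly growing monthly bill, and the cumulative cost grows quadratically until you start deleting.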
S3 is a de facto standard supported by multiple vendors and open source technology. You can pick lots of other cloud vendors and use S3 there, or you can self-host S3. It's one of many AWS technologies that is not vendor-locked.
Yep, for ex: we use the exact same code to get/put files to DigitalOcean Spaces... the only thing that changes is a config value for the S3 endpoint. S3 is basically an open standard. The only vendor lock-in would be the time/money it takes to copy your data out of S3, but that's the same for anywhere you store your data... even when self-hosting.
At around, say, 10k requests/sec, by far the dominant cost is the S3 API cost for GET requests at $0.0004 per 1,000 calls. A cache in front saves on the order of $10k/month in this scenario, and if scaling is expected, then planning ahead like this is a good idea.
Wish they would have given an idea of how much this helped improve their AWS bill. Hard to judge the value of this implementation without knowing the $$$ saved.
How is $200/mo savings worth spending more than like 10 seconds of engineering time on? If you have to support the solution at all you wipe out all the savings almost instantly.
We're a one-man shop, so the cost savings made sense in the long run. I already use SSDB and Redis in production so I'm very familiar with supporting it.
Generally good thinking, but a few "real world" notes:
A) you first need to run the service to get any customers
B) this might take long
C) you maybe don't want / can't get VC money at this stage
D) you maybe are not the most advanced dev who can properly utilize S3 from an I/O perspective, leaving you with higher costs than necessary
E) there might be a time period between introducing the service and getting traction which yields enough feedback, so you can start adding more business features
F) when you are burning your own money, you are more sensitive to the cost side - which is not ultimately wrong
No, pengaru's right. They're not mutually exclusive. For ex: making the site faster and more stable means I spend less time fixing and keeping things running and more time building new features.
Nah. As someone who has gone from a 1 person project to ~10 people and funding, taking the time to tackle problems like this is critical - it frees up an order of magnitude of time later, helps make the product experience smoother, saves you real money that, when you're one person, really can make a difference.
Which could have been gained by.... hosting on AWS.
Which ends up being back to purely cost-optimizing the initial cost optimization.
It seems like the whole product is focusing on cost optimization. I suspect that the OP would be better off making money by doing cost optimizations for third parties.
The economics don't work out on that... we migrated away from our $3k/mo AWS bill back in the day to save $2k/mo with DO. Going back to EC2 would save $200/mo but cost $2,000/mo in compute.
This is the problem with a lot of the dev ecosystem. Many of the tool-makers don't understand the concept of a one-man shop, so scale to/from zero is never given the first class priority it deserves
You can look at this as time wasted on premature optimization. That time could have been spent on building valuable functionality, that would actually grow the business.
How much experience bootstrapping a product do you have? It's not always a choice between spending time optimizing vs. new features. Sometimes you have time for both. Sometimes optimization means a more stable product that frees up your time for the new features.
I started a few projects from scratch... with stricter financial restrictions as well.
Splitting up the storage and execution between cloud providers was clearly an early cost optimization.
Then this cost optimization was needed to fix the result of the previous cost optimization.
Adding more code that has no clear end value to your service* never makes anything more stable. Now OP has to maintain three things, instead of two. It's a classic "look at how smart I am" overengineering.
*OP's customers definitely give 0 f's if everything is on AWS or split between AWS and DO.
I have multiple small side projects running, and cannot afford to have an expense of more than a few dollars each per month. It’s a fun project to try and reduce costs as long as your priorities are straight.
I remember seeing your comment on that previous post. I enjoy seeing those sorts of "behind the curtain" details that break down the stack & cost of applications.
I'm curious if you've tested what it would cost to just host on EC2 (or something potentially even cheaper in AWS like ECS Fargate) with a savings plan. At a glance, it looks like AWS would be cheaper than DO if you can commit to 1-yr reserved instances.
That would seem like an easier (and possibly more effective) way to get around costly AWS outbound data costs compared to running a separate cache and sending data across the internet between cloud providers just to save $200/mo.
Back in the day I did use EC2, but to get the same performance (especially SSD IOPs) it cost a lot more than DigitalOcean. Back then, the monthly bill for EC2 was over $3,000/mo. Switching to DigitalOcean we got better performance for under $1k/mo.
Hadn't heard of MinIO. We looked into another S3 caching solution that someone linked in comments here, but it came down to we already used SSDB in production. Also, a personal preference of mine is to keep all logic in the same layer: Python.
For ex: We use SQLAlchemy as the source of truth for our relational database schema, and we use Alembic to manage database schema changes in Python. We even shard our Postgres tables in Python [1]. It fits with my preference to keep this caching logic also in Python, by reading/writing directly to the cache and S3 from the Python app and workers.
Very cool! Thanks for sharing, I hadn't seen this before. Edit: Actually I looked at this before using SSDB and it was overly complicated for my use case.
We went with SSDB because we already used Redis and SSDB, were very familiar with them, and had already worked out any pitfalls with using them. Wish we had found this sooner, thanks!
Hadn't heard of SSDB before, looks promising, active, variety of users, broad client support.
Off the top of my head, if I were trying to front AWS S3 from DigitalOcean, I'd have gone with MinIO and their AWS S3 gateway[1]. It appears to be purpose built for this exact kind of problem.
You might face a language barrier when trying to find answers from the community: For example it looks like most of the discussion on the issues and pull requests are in Chinese. I do not believe Google Translate does well with technical terms.
(Of course this is a concern only if you don't read Chinese.)
Last I checked, DigitalOcean has a "Spaces" feature for file storage, which is compatible with the AWS S3 API (and they even suggest using the AWS client library).
I always found it odd that we can easily port apps, databases, etc., from one cloud to another, but for file storage/CDN it's always some proprietary solution like S3. AFAIK open source solutions never really took off.
Just use a CDN like CloudFront or Cloudflare. S3 is for storage, not retrieval. Building your own poorly distributed, not-at-edge pseudo-CDN won't help as much as using a real edge CDN. Edge CDNs are closer.
Aren’t both of the problems posed solvable with stock Redis? Using an append-only file for persistence and enabling virtual memory support so the dataset can grow beyond RAM size.
What do you mean by deprecated? As far as I know, Swap partition/files are alive and kicking in Linux. There are even recommendations of the amount of swap you should setup depending on the available RAM.
How would SSDB compare to using etcd? They both are disk backed key-value stores, but SSDB doesn't seem to be under active maintenance, with most of the recent merges being about fixing build issues.
The main difference is you can use existing code/libraries without changing anything if you're already using Redis. That was a big win for us, since we could just swap out the Redis instances that suffered from low hit rates (because the data didn't fit in RAM and was purged) for SSDB easily.
This article seems to take it out on Redis, as if Redis poorly supports this use case, but Redis was never the right choice for a cheap, vertically scalable k-v store. I didn't know SSDB, but really you could have chosen among many other k-v stores, or even memcached with disk support.
I guess the point is that someone chose Redis and soon realized that RAM was not going to be the cheapest store for 500GB of data. But that is not Redis' fault.
There was an analysis done on the Redis API in Scylla back in 2019 (https://siddharthc.medium.com/redis-on-nvme-with-scylladb-5e...). The implementation, while still experimental, has come a long way since. I would be very interested to hear additional feedback from Redis users.
Not getting how this helps. How do they know if the data is stale in SSDB? If the data is immutable I get that caching it locally speeds things up, but then couldn't they merely mirror files and check if the file exists?
The majority of code stats (the data being stored) is received for the current day. Older data still comes in, but in that case we just update the SSDB cache after writing the new data to S3.
The reason we don't use SSDB as the main source of truth is because S3 provides replication and resilience.
Sure, I meant a mirror is just a cache for data that doesn't change. I edited the original comment, because it doesn't help explain why we used SSDB here.
Hm. If a disk-based redis clone is actually more performant than redis... what are the reasons to use redis instead of SSDB? Why is anyone using redis instead of SSDB? Should I stop?
Because files are hard, a lot of projects use an embedded storage library called LevelDB/RocksDB [1]. They interact with the library instead of directly with files. CockroachDB (RocksDB) and SSDB (LevelDB) are examples of projects built on them.
Minio is not a caching layer, it's pretty much spinning up your own object storage. I have used it and SSDB, and SSDB is a better drop-in replacement for Redis. Minio is a drop-in for S3 if you don't want to pay for S3 and aren't serving a large amount of data.
Yup, nothing will match the redundancy and reliability of S3. If you use Minio and have a hard drive issue, you now need to restore. If you really want to cut down on S3 costs some more, then you might want to use Wasabi or Backblaze with Minio as the S3 layer in front. If you're profitable though, the peace of mind of S3 is worth more than trying to save another $100.
Yep, it's true SSDB is a hidden gem that deserves more exposure. LevelDB/RocksDB has been around for a while with benchmarks. Here's a related reply about it from earlier https://news.ycombinator.com/item?id=26957339
Since they are using Redis as a simple key-value store, I wonder if they looked into something like LMDB which is disk-backed and has even better performance?
I don't know Redis' design, but all sorts of things can slow down a program's reads from memory: CPU cache misses, mutexes, garbage collection, CPU utilization, or something else. Whereas the page cache and disk cache can be very efficient, with nothing but more reads/writes to get in your way. Or maybe the program reading from memory is single-threaded, and the one reading from disk isn't.
The magic of Google's LevelDB? Seriously though, I don't know and I'm not sure if it's an accurate benchmark. I just know our statsd metrics show it performing just as fast as official Redis on much larger data sets, so that benchmark is probably correct.
You are decreasing your bill at the cost of your reliability: you are moving from the reliability of S3 to the single-node reliability of your reverse proxy.
Maybe, if you have the technological knowledge, start your own block storage using Rook/Ceph object storage on DO. This will reduce your bill even further, and if you know what you are doing, you can improve the reliability.
No, the SSDB cache can go offline and the app just reads from S3 directly. I appreciate everyone trying to suggest better ways to solve this, but know that I've gone through many solutions dual-writing in production, comparing latency and throughput on production loads, and this is the best so far.
That's kind of what we're doing, until we start using SSDB Clustering. Filesystem as cache would be good for a local same-server cache, but it doesn't scale well when used over the network.
I want to hate AWS as much as the next guy, but this is nearly the dumbest thing I've seen trending lately. They are paying money for something that is FREE to save a dime.
AWS S3 -> AWS EC2 bandwidth is FREE. To pay money to send the data to DO EC2 and then build a Redis cache to save a few dimes on EC2 that the D.O. Redis cluster now spends is...
Can you please not post shallow dismissals and put other people's work down like this? Putdowns tend to get upvoted, and then they sit at the top of threads, letting off toxic fumes and making this a nasty place. We're trying to avoid that here. (I've downweighted this subthread now.)
I'm not sure why name-calling and meanness attract upvotes the way they do—it seems to be an unfortunate bug in how upvoting systems interact with the brain—but HN members need to realize that posts like this exert a strong conditioning effect on the community.
It's always possible to rephrase a comment like this as, for example, a curious question—you just have to remember that maybe you're not 100% aware of every consideration that went into someone's work.
Totally get that. The journey here wasn't straightforward. I've tried dual-writing in production to many databases including Cassandra, RethinkDB, CockroachDB, TimescaleDB, and more. Haven't tried ClickHouse/VictoriaMetrics, but probably won't now because S3 scales beautifully. The main reason not using EC2 is compute and attached SSD IOPs costs. This balance of DO compute and AWS S3 is the best combination so far.