The merging happens primarily on hot data, and I haven't run into any issues there.

But there are lots of approaches, depending on your needs.

You can (and should) define a "cache disk" for S3, which will cache up to X GB locally to avoid thrashing.

Another option is to move data into separate (purely S3-backed) tables after a certain time, so you don't accidentally fetch large amounts of data from S3. You can still easily join the data together if needed.
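
To make the hot/cold split concrete, here is a rough sketch of the pattern, not the real setup: it uses Python's built-in sqlite3 as a stand-in for the actual database, and the names `events_hot`, `events_cold`, `events_all`, and `archive_old_rows` are all made up for illustration. In a real deployment the cold table would be the one that lives purely on S3.

```python
import sqlite3


def archive_old_rows(conn, cutoff):
    """Move rows with ts < cutoff from the hot table to the cold table."""
    with conn:  # single transaction: copy first, then delete
        conn.execute(
            "INSERT INTO events_cold SELECT * FROM events_hot WHERE ts < ?",
            (cutoff,),
        )
        conn.execute("DELETE FROM events_hot WHERE ts < ?", (cutoff,))


# In-memory stand-in; hypothetical schema with a timestamp and a payload.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events_hot  (ts INTEGER, payload TEXT);
    CREATE TABLE events_cold (ts INTEGER, payload TEXT);
    -- Everyday queries hit events_hot only; this view is for the rare
    -- case where old and new data are needed together.
    CREATE VIEW events_all AS
        SELECT * FROM events_hot
        UNION ALL
        SELECT * FROM events_cold;
    """
)
conn.executemany(
    "INSERT INTO events_hot VALUES (?, ?)",
    [(1, "a"), (2, "b"), (10, "c")],
)

archive_old_rows(conn, cutoff=5)

hot_count = conn.execute("SELECT COUNT(*) FROM events_hot").fetchone()[0]
cold_count = conn.execute("SELECT COUNT(*) FROM events_cold").fetchone()[0]
all_count = conn.execute("SELECT COUNT(*) FROM events_all").fetchone()[0]
```

The point of keeping the tables separate is that a careless query against the hot table can never pull the archive down from S3; you only pay for that when you explicitly query the cold table or the combined view.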