The merging happens primarily on hot data, and I haven't run into any issues there.
But there are lots of approaches, depending on your needs.
You can (and should) define a "cache disk" for S3, which will cache up to X GB locally to avoid thrashing.
Another option is to move data into separate (purely S3-backed) tables after a certain time, so you don't accidentally fetch large amounts of data from S3.
You can still easily join the data together if needed.
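
To illustrate why a size-capped local cache helps with thrashing, here is a toy sketch in Python. It is not how any particular database implements its cache disk; the class name `ByteCappedCache`, the `fetch_remote` callable, and the least-recently-used eviction policy are all assumptions made for the example. The idea is just that repeated reads of the same objects are served locally, and the cache evicts old entries once it exceeds its byte budget.

```python
from collections import OrderedDict


class ByteCappedCache:
    """Toy LRU cache with a total-size cap, standing in for a local cache disk."""

    def __init__(self, max_bytes, fetch_remote):
        self.max_bytes = max_bytes
        self.fetch_remote = fetch_remote  # hypothetical callable: key -> bytes
        self.entries = OrderedDict()      # key -> bytes, least recently used first
        self.used_bytes = 0
        self.remote_reads = 0             # how often we had to go to remote storage

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark as most recently used, no remote read needed.
            self.entries.move_to_end(key)
            return self.entries[key]

        # Cache miss: fetch from remote storage (e.g. S3 in the real setup).
        data = self.fetch_remote(key)
        self.remote_reads += 1

        # Objects larger than the whole cache are returned but never stored.
        if len(data) <= self.max_bytes:
            self.entries[key] = data
            self.used_bytes += len(data)
            # Evict least recently used entries until we are under the cap.
            while self.used_bytes > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used_bytes -= len(evicted)
        return data


# Usage: a 10-byte cache in front of a fake remote that returns 4-byte objects.
cache = ByteCappedCache(max_bytes=10, fetch_remote=lambda key: b"x" * 4)
for key in ["a", "b", "a", "c"]:
    cache.get(key)
```

In this run the second read of `"a"` is a local hit, and adding `"c"` pushes the cache over its cap, so the least recently used entry (`"b"`) is evicted. A real cache disk works on file segments rather than whole keys, but the trade-off is the same: pick X large enough that your hot working set fits.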