
My first thought upon seeing the prompt:

    If you would build an in-memory cache, how would you do it?

    It should have good performance and be able to hold many entries. 
    Reads are more common than writes. I know how I would do it already, 
    but I’m curious about your approach.
was to add this requirement, since it comes up so often:

    Let's assume that keys accessed follow a power law, so some keys get 
    accessed very frequently and we would like them to have the fastest 
    retrieval of all.
I'm not sure if there are any efficient tweaks to hash tables or B-trees that might help with this additional requirement. Obviously we could make a hash table take far more space than needed to reduce collisions, but with a decent load factor, is the answer just to swap frequently accessed keys to the beginning of their probe chain? And how do we know a key is frequently accessed in the first place? A Count-Min sketch?
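To make the "how do we know it's frequently accessed" question concrete, here's a minimal Count-Min sketch (a sketch of the sketch, so to speak: the width, depth, and hash mixing are arbitrary choices, and a real cache would also age the counters periodically):

    import java.util.concurrent.ThreadLocalRandom;

    // Minimal Count-Min sketch: DEPTH hash rows of WIDTH counters each.
    // An item's estimated frequency is the minimum of its counters, so
    // collisions can inflate the estimate but never deflate it.
    final class CountMinSketch {
        private static final int DEPTH = 4;
        private static final int WIDTH = 1 << 14;  // power of two for cheap masking

        private final int[][] counts = new int[DEPTH][WIDTH];
        private final long[] seeds = new long[DEPTH];

        CountMinSketch() {
            for (int i = 0; i < DEPTH; i++) {
                seeds[i] = ThreadLocalRandom.current().nextLong();
            }
        }

        private int slot(int row, Object key) {
            long h = (key.hashCode() * 0x9E3779B97F4A7C15L) ^ seeds[row];
            h ^= h >>> 32;  // mix the high bits down before masking
            return (int) (h & (WIDTH - 1));
        }

        void increment(Object key) {
            for (int i = 0; i < DEPTH; i++) {
                counts[i][slot(i, key)]++;
            }
        }

        int estimate(Object key) {
            int min = Integer.MAX_VALUE;
            for (int i = 0; i < DEPTH; i++) {
                min = Math.min(min, counts[i][slot(i, key)]);
            }
            return min;
        }
    }

Since collisions can only inflate counts, the estimate never underestimates, which makes it safe as a promotion filter: a key that looks cold really is cold.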

Even with that tweak, the hottest keys will still be scattered around memory. Wouldn't it be best if their entries could fit into fewer pages? So, maybe a much smaller "hot" table containing, say, the 1,000 most accessed keys. We still want a high load factor to maximize the use of cache pages, so perhaps perfect hashing?
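Here's a naive sketch of that two-tier layout, with a fixed-size hot table checked before the main map. The promotion rule and sizes are made up for illustration, and true perfect hashing would require periodically rebuilding the hot table offline:

    import java.util.HashMap;
    import java.util.Map;

    // A tiny direct-mapped "hot" table in front of the main map: the hottest
    // keys live in two compact arrays that fit in a few pages. Promotion is a
    // naive counter threshold standing in for a Count-Min sketch.
    final class TieredCache<K, V> {
        private static final int HOT_SIZE = 1024;      // power of two
        private static final int PROMOTE_AFTER = 64;   // arbitrary hotness threshold

        private final Object[] hotKeys = new Object[HOT_SIZE];
        private final Object[] hotValues = new Object[HOT_SIZE];
        private final Map<K, V> main = new HashMap<>();
        private final Map<K, Integer> hits = new HashMap<>();

        @SuppressWarnings("unchecked")
        V get(K key) {
            int slot = key.hashCode() & (HOT_SIZE - 1);
            if (key.equals(hotKeys[slot])) {
                return (V) hotValues[slot];            // hot path: one probe into compact arrays
            }
            V value = main.get(key);
            if (value != null && hits.merge(key, 1, Integer::sum) >= PROMOTE_AFTER) {
                hotKeys[slot] = key;                   // promote; evicts whoever held the slot
                hotValues[slot] = value;
                hits.remove(key);
            }
            return value;
        }

        void put(K key, V value) {
            main.put(key, value);
            int slot = key.hashCode() & (HOT_SIZE - 1);
            if (key.equals(hotKeys[slot])) {
                hotValues[slot] = value;               // keep the hot copy coherent
            }
        }
    }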



I've been doing this so long that my first thought was "use redis."

Why?

* it works

* it's available now

* it scales

* it's capable of HA

* it has bindings for every language you probably want to use

Why bother writing your own cache, unless it's for an exercise? Cache management is complicated and error-prone. Unless the round trip kills you, just use Redis (or memcached).
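The whole "cache" then collapses to a few calls. A minimal sketch with the Jedis client, assuming a Redis server on localhost:6379 (key name and TTL are arbitrary):

    import redis.clients.jedis.Jedis;

    // Cache a value with a TTL and read it back; Redis handles eviction,
    // expiration, and concurrency for you.
    public final class RedisCacheExample {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.setex("user:42", 300, "{\"name\":\"Ada\"}");  // 5-minute TTL
                String cached = jedis.get("user:42");               // null once expired
                System.out.println(cached);
            }
        }
    }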


In a typical LRU cache every read is a write, in order to maintain the access order. If the cache is concurrent, those mutations cause contention: the skewed access distribution ends up serializing threads on the atomic operations that maintain the ordering. Concurrent caches avoid most of that work by exploiting the fact that popular items will soon be reordered again anyway, e.g. they sample requests into lossy ring buffers and replay those reorderings under a try-lock. This is what Java's Caffeine cache does, reaching 940M reads/s using 16 threads (vs 2.3B/s for an unbounded map). At that point other system overhead, like network I/O, will dominate the profile, so trying to rearrange the hash table to dynamically optimize the data layout for hot items seems unnecessary. As you suggest, one would probably be better served by a SwissTable-style approach that optimizes the hash table's data layout and instruction mix, rather than mucking with recency-aware structural adjustments.
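A minimal sketch of that lossy-buffer pattern, not Caffeine's actual implementation (the buffer size, drain trigger, and stand-in LRU list are all simplifications):

    import java.util.ArrayDeque;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.AtomicReferenceArray;
    import java.util.concurrent.locks.ReentrantLock;

    // Reads record their key into a fixed ring buffer with a CAS; if the slot
    // is still occupied the event is simply dropped (lossy), so readers never
    // block. Buffered reorderings are replayed against the LRU order in
    // batches, and only by whichever thread wins the try-lock.
    final class LossyReadBuffer<K> {
        private static final int SIZE = 64;        // power of two
        private static final int MASK = SIZE - 1;

        private final AtomicReferenceArray<K> buffer = new AtomicReferenceArray<>(SIZE);
        private final AtomicLong writes = new AtomicLong();
        private final ReentrantLock evictionLock = new ReentrantLock();
        private final ArrayDeque<K> lruOrder = new ArrayDeque<>();  // stand-in for an intrusive LRU list

        void recordRead(K key) {
            int index = (int) (writes.getAndIncrement() & MASK);
            buffer.compareAndSet(index, null, key);  // lossy: drop on collision
            if (index == MASK) {
                tryDrain();  // the buffer has wrapped; someone should drain it
            }
        }

        private void tryDrain() {
            if (!evictionLock.tryLock()) {
                return;  // another thread is already draining; skip, don't wait
            }
            try {
                for (int i = 0; i < SIZE; i++) {
                    K key = buffer.getAndSet(i, null);
                    if (key != null) {
                        lruOrder.remove(key);   // O(n) here; fine for a sketch
                        lruOrder.addLast(key);  // move to most-recently-used
                    }
                }
            } finally {
                evictionLock.unlock();
            }
        }
    }

The key property is that readers only ever do a CAS into an array slot; the expensive reordering happens in batches, and only when the try-lock is free.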

The fastest retrieval will be a cache hit, so once the data structures are no longer the bottleneck the focus should switch to hit rates. That's where the Count-Min sketch, hill climbing, etc. come into play in the Java case. There's also memoization to avoid cache stampedes, efficient expiration (e.g. timing wheels), async reloads, and so on, all of which can become important. With a dedicated cache server like memcached, one instead has to worry about fragmentation, minimizing wasted space (to maximize usable capacity), efficient I/O, etc., because every cache server can saturate the network these days, so the goal shifts toward reducing operational cost while keeping tail latencies stable. What "good performance" means is really a spectrum, because one should optimize for overall system performance rather than any individual, narrow metric.
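As one example, stampede protection mostly falls out of memoizing the in-flight load rather than the value; a sketch, with all names hypothetical:

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Memoizes the in-flight computation rather than the value: concurrent
    // readers missing on the same key all join one load instead of each
    // hitting the backing store (the "stampede").
    final class MemoizingLoader<K, V> {
        private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
        private final Function<K, V> loader;

        MemoizingLoader(Function<K, V> loader) {
            this.loader = loader;
        }

        V get(K key) {
            CompletableFuture<V> future = inFlight.computeIfAbsent(
                key, k -> CompletableFuture.supplyAsync(() -> loader.apply(k)));
            try {
                return future.join();           // every caller shares one load
            } finally {
                inFlight.remove(key, future);   // first caller clears it; later removes no-op
            }
        }
    }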


I think splay trees would be good for this: https://en.m.wikipedia.org/wiki/Splay_tree
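For reference, the core of a splay tree is small; here's a minimal recursive splay over integer keys (illustrative only), where every lookup rotates the accessed key to the root so hot keys cluster near the top:

    // Each access splays the touched key to the root, giving the
    // frequently-accessed keys the shortest paths.
    final class SplayTree {
        static final class Node {
            int key;
            Node left, right;
            Node(int key) { this.key = key; }
        }

        static Node rightRotate(Node x) {
            Node y = x.left;
            x.left = y.right;
            y.right = x;
            return y;
        }

        static Node leftRotate(Node x) {
            Node y = x.right;
            x.right = y.left;
            y.left = x;
            return y;
        }

        // Moves key to the root if present, else the last node touched on the path.
        static Node splay(Node root, int key) {
            if (root == null || root.key == key) return root;
            if (key < root.key) {
                if (root.left == null) return root;
                if (key < root.left.key) {                    // zig-zig
                    root.left.left = splay(root.left.left, key);
                    root = rightRotate(root);
                } else if (key > root.left.key) {             // zig-zag
                    root.left.right = splay(root.left.right, key);
                    if (root.left.right != null) root.left = leftRotate(root.left);
                }
                return (root.left == null) ? root : rightRotate(root);
            } else {
                if (root.right == null) return root;
                if (key > root.right.key) {                   // zag-zag
                    root.right.right = splay(root.right.right, key);
                    root = leftRotate(root);
                } else if (key < root.right.key) {            // zag-zig
                    root.right.left = splay(root.right.left, key);
                    if (root.right.left != null) root.right = rightRotate(root.right);
                }
                return (root.right == null) ? root : leftRotate(root);
            }
        }
    }

The caveat for this use case: splaying mutates the tree on every read, which is exactly the read-becomes-write contention problem described in the sibling comment about concurrent LRU caches.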


You should check out the FASTER paper from Microsoft. It specifically covers how to create a K/V log that spills to disk for older keys, but keeps recent keys in memory.
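A toy illustration of that spill-to-disk shape (nothing like the paper's actual design, which adds epoch protection, a concurrent hash index, and staged mutable/read-only/on-disk log regions; all names and sizes below are made up):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayDeque;
    import java.util.HashMap;
    import java.util.Map;

    // Every write is an append; an index maps each key to the logical address
    // of its latest record; the newest records live in an in-memory tail and
    // older records get flushed to a file.
    final class ToyHybridLog implements AutoCloseable {
        private static final int IN_MEMORY_BUDGET = 1 << 20;  // keep ~1 MiB in memory

        private final Map<String, Long> index = new HashMap<>();    // key -> logical address
        private final ArrayDeque<byte[]> tail = new ArrayDeque<>(); // in-memory records, oldest first
        private final RandomAccessFile disk;
        private long headAddress = 0;  // address of the oldest record still in memory
        private long tailAddress = 0;  // address the next record will get
        private int tailBytes = 0;

        ToyHybridLog(String path) throws IOException { this.disk = new RandomAccessFile(path, "rw"); }

        void put(String key, String value) throws IOException {  // assumes keys contain no '='
            byte[] record = (key + "=" + value).getBytes(StandardCharsets.UTF_8);
            index.put(key, tailAddress);
            tail.addLast(record);
            tailAddress += 4 + record.length;  // 4-byte length prefix + payload
            tailBytes += 4 + record.length;
            while (tailBytes > IN_MEMORY_BUDGET) flushOldest();
        }

        String get(String key) throws IOException {
            Long address = index.get(key);
            if (address == null) return null;
            byte[] record = null;
            if (address >= headAddress) {           // recent key: still in the memory tail
                long a = headAddress;
                for (byte[] r : tail) {             // linear scan is fine for a toy
                    if (a == address) { record = r; break; }
                    a += 4 + r.length;
                }
            } else {                                // old key: read it back from disk
                disk.seek(address);
                record = new byte[disk.readInt()];
                disk.readFully(record);
            }
            String s = new String(record, StandardCharsets.UTF_8);
            return s.substring(s.indexOf('=') + 1);
        }

        private void flushOldest() throws IOException {
            byte[] oldest = tail.removeFirst();
            disk.seek(headAddress);
            disk.writeInt(oldest.length);
            disk.write(oldest);
            headAddress += 4 + oldest.length;
            tailBytes -= 4 + oldest.length;
        }

        @Override public void close() throws IOException { disk.close(); }
    }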




