It seems every few years I read about another KVS (half of the time from academia) purporting to have desirable properties previously not found.
As someone who's never had to work with distributed data structures, I'd love to see an article comparing and contrasting them. There seems to be a whole lot of these things, so they can't all be the same. Off the top of my head, there's RocksDB, Aerospike, FoundationDB, ZippyDB, and "Monkey" (published by Harvard), which is apparently an "optimal" key-value store, and I'm sure there are many more. What is it that keeps improving?
I'd also love to read something about other distributed systems and data structures. Surely we need more than key-value stores?
The post touches on the underlying problem. Distributed stuff is fundamentally very difficult. As Jepsen has shown so definitively, even very experienced developers are unlikely to get these protocols right. Small changes can destroy correctness, or shift the assumed consistency / isolation level in surprising ways.
The NoSQL fad embraced one extreme: the simplicity and performance of eventual consistency, essentially pushing any complexity that creates onto the application. At the other extreme, the Spanner paper explicitly argues for prioritizing correctness first, using a conservative transaction design that is more straightforward to implement and use, at the cost of potentially much worse performance under high contention.
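To make the "pushing complexity onto the application" point concrete, here's a toy sketch in Python. It isn't modeled on any real store: `Replica` and `merge_carts` are hypothetical names, and the "replication" is just two in-memory dicts. It only shows that when two replicas accept conflicting writes, something outside the store has to decide how to reconcile them.

```python
# Toy model of eventual consistency: two replicas accept writes independently.
# Replica and merge_carts are hypothetical names used only for illustration.

class Replica:
    def __init__(self):
        self.data = {}  # key -> value (here, a set of cart items)

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


def merge_carts(a, b):
    """Application-level conflict resolution: keep the union of both carts."""
    return (a or set()) | (b or set())


r1, r2 = Replica(), Replica()

# Two clients update the same key on different replicas before any sync.
r1.put("cart:42", {"book"})
r2.put("cart:42", {"lamp"})

# Until the replicas converge, reads disagree depending on which one you hit.
diverged = r1.get("cart:42") != r2.get("cart:42")

# The store can't know which write should win; the application supplies the merge.
merged = merge_carts(r1.get("cart:42"), r2.get("cart:42"))
r1.put("cart:42", merged)
r2.put("cart:42", merged)
converged = r1.get("cart:42") == r2.get("cart:42")
```

A union merge happens to work for a shopping cart, but it would be the wrong rule for, say, an account balance. That choice is exactly the complexity the application inherits.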
Both approaches have merits, but I think it's clear they aren't one-size-fits-all solutions, or even one-size-fits-most. Hence the explosion in systems trying to find some balance between these extremes. With time the picture should resolve. The early days of RDBMSs went through a similar explosion, followed by consolidation down to the most effective algorithms.