I'm happy you brought up clustering. Internally we've been quite frustrated with this part of the product, but until a few months ago we held off on developing it for two reasons: we wanted to collect more information from users on real use cases and behavior, and there were more immediate bottlenecks in the product.
We restarted heavy development on the clustering infrastructure two months ago, and just yesterday I played with the prototype of the first upcoming upgrade. It's a WIP but is absolutely delightful (you can see my tongue-in-cheek review of it at https://github.com/rethinkdb/rethinkdb/issues/2957).
Here are the parts that are already done and will be shipped soon:
- Vector clock conflicts are now resolved automatically; no
more manual conflict resolution
- There is now a ReQL API for clustering that's dramatically
better than the current `rethinkdb admin` tool
- Much love has been put into presenting the abstractions to
users. Everything is cleaned up and simplified, it's easier
to understand and change, and even in advanced cases you
won't have to know anything about blueprints/semilattices.
- Really, this is about to get dramatically better. I can't
summarize it in a bullet point; we put an enormous amount of
effort into this in a thousand different places.
Here's what's coming immediately after that:
- Automatic failover
- Always-on resharding (no more resharding downtime)
(These latter updates are coming after the API overhaul because they require a lot of simplification/refactoring/redesign internally as well as externally, and we wanted to do it piecemeal.)
Thanks for writing up your feedback and sticking with RethinkDB despite the limitations of clustering 1.x. Multiple people are currently working very hard on this, and things are about to get a lot better.
This makes me very happy to hear. I've been using RethinkDB for a few months now and I love it, but the manual failover has made me a bit uneasy.
Any chance of a clue as to the kind of timeframe for this? Even a very rough idea would be fine as I can appreciate you might not want to (or be able to) commit to anything yet.
I think we'll be able to ship the new clustering API in ~two months (note, it's a huge and massively delightful change). I'm hoping we'll be able to get failover out two months after that, but it's hard to give precise estimates looking that far out.