This is awesome! I've been waiting for Aphyr to publish his analysis of RethinkDB, as this is a project I carried out in a recent distributed systems class that I took. Although our analysis[0] is not as comprehensive (or correct?) as Aphyr's, we still managed to learn quite a lot.
If you are looking at using Jepsen to do your own analysis, I have a few takeaways that might be worth sharing -
- Have a passable knowledge of Clojure
- Get a beefy workstation. We used a 160 gig EC2 instance[1] and still couldn't get Knossos (the linearizability checker) to complete for longer runs.
- Use the Docker-in-Docker setup[2] to minimize the frustration
- Pick an existing system closest to the system you want to analyze and see Aphyr's version of the tests for guidance and overall flow. The code is well commented and you should be able to follow through.
All in all, our major takeaway was that Jepsen is a wonderful (albeit complex) piece of software that takes time to get up and running. Once you are past that though, it stands as a very complete testing tool in itself.
Sincere thanks to Aphyr for open-sourcing it and helping us with our project!
Great to finally see a write-up by Aphyr on RethinkDB. Ever since reading these blogs and seeing RethinkDB's long-standing GitHub issue [1], I was keen to hear how RethinkDB would hold up to the tests once Raft was implemented.
Given the thoroughness of the Jepsen test suite it is something people want to see these days before being able to choose a database with any confidence. Hopefully this sets out expectations with high transparency.
Kudos to the team at RethinkDB for funding and assisting Aphyr in his work.
> Given the thoroughness of the Jepsen test suite it is something people want to see these days before being able to choose a database with any confidence.
Definitely agreed on the "something people want to see" part, but this is the thing... Jepsen isn't actually that thorough[1]. I rather think it is an indictment of the state of "practical" distributed computing as it currently stands that a "simple" test for linearizability (nowadays) or even simple CAS (which I believe Jepsen started out testing) in a partitioned system would turn up such a huge number of badly implemented distributed systems and... frankly dishonest documentation around those systems.
It's not that I could necessarily do any better -- except maybe the "honesty" part, or at least adding lots of qualifiers -- I just find it a bit... sad in a way that we haven't come farther. Still, it is a young field, so there may be grounds for optimism for the future. (Thinking of e.g. dependent type systems coming together with model checking coming together with verified model->machine translation, chips with verified semantic models, etc. etc.)
[1] As Aphyr explicitly states, it's actually very limited in the state space that it can explore simply because it's constrained to be "external" to the system being tested. Model checkers can do much more -- but then you usually don't get a fully automatic and verified translation to machine code... and who knows if you've modeled the CPU/IOMMU/etc. correctly anyway?
EDIT: Btw, Aphyr deserves a HUGE amount of praise. AFAIK he's the only person so far who has stepped up to the plate and "dared" to actually test this stuff. It's kind of amazing in a way, but I think I'll blame corporate culture for this sort of thing... "Oh, the documentation says $X therefore we'll believe $X. At least they can't fire us for that." Not surprisingly I was known to be hugely skeptical of any claims made of distributed systems, but I was too much of a coward to embark on "Aphyr's Quest" :). I'm hoping to coin a phrase with that last bit.
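For anyone wondering what a "simple" linearizability test on a single CAS register amounts to, here is a toy brute-force sketch. It is not how Knossos is implemented; the `Op` record, its field names, and the integer timestamps are made up for illustration, and it assumes every operation completed (no timeouts or indeterminate results, which a real checker has to handle). It simply searches for some ordering that respects real time and explains every observed result, which also hints at why longer histories get expensive to check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    # Hypothetical record of one client operation against a single register.
    kind: str        # "read", "write", or "cas"
    arg: object      # value written, (expected, new) for cas, None for read
    result: object   # value read, True/False for cas, None for write
    invoke: int      # time the client sent the request
    complete: int    # time the client saw the response


def step(state, op):
    """Apply op to the register; return (result_matches, new_state)."""
    if op.kind == "write":
        return True, op.arg
    if op.kind == "read":
        return op.result == state, state
    if op.kind == "cas":
        expected, new = op.arg
        succeeded = state == expected
        return op.result == succeeded, (new if succeeded else state)
    raise ValueError(f"unknown op kind: {op.kind}")


def linearizable(history, state=None):
    """True if some order that respects real time explains every result."""
    if not history:
        return True
    for i, op in enumerate(history):
        rest = history[:i] + history[i + 1:]
        # op can only go next if no remaining op finished before op started.
        if any(other.complete < op.invoke for other in rest):
            continue
        ok, new_state = step(state, op)
        if ok and linearizable(rest, new_state):
            return True
    return False


# A read that overlaps a write may see the new value.
overlapping = [Op("write", 1, None, 0, 10), Op("read", None, 1, 5, 15)]

# A read that starts after both writes finished must not see the older value.
stale = [
    Op("write", 1, None, 0, 2),
    Op("write", 2, None, 3, 5),
    Op("read", None, 1, 6, 8),
]

print(linearizable(overlapping), linearizable(stale))
```

The search is worst-case factorial in the number of concurrent operations, so this is only meant to make the idea concrete, not to be used on real histories.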
> Consistent with the documentation, I have never found a linearization failure with these settings. If you use hard durability, majority writes, and majority reads, single-document ops in RethinkDB appear safe.
I believe that Zookeeper + Curator were entirely consistent with their documentation as well as linearizable: https://aphyr.com/posts/291-jepsen-zookeeper . Now, of course, Zookeeper is built as a state management / coordination service more than a full-fledged database system, so it's not quite apples to apples, but it did offer a similarly strong showing.
I think ElasticSearch is also now consistent with its documentation... although the documentation basically says "here be dragons."
As an aside, re-reading these makes me miss the old Aphyr work from before he worked on contract. The goofy image memes and 100% irreverent tone made for good times.
My last two projects have been developed with RethinkDB as the backend, and personally I love it. I tried playing with Mongo for a bunch of projects, and I just didn't enjoy its shortcomings. Coming from years of MySQL and other relational databases, RethinkDB gives me the best of both worlds: a relational query language with the flexibility of schemaless tables. Changefeeds make real-time applications super easy, and the admin GUI is awesome. It's essentially the next evolution of databases, developed by people who know what they are doing. You'll also definitely want to grab the Thinky.io ORM when working in Node.js.
I'm consistently impressed with the results that Aphyr publishes. Always very thorough and as balanced as I've ever seen regarding database testing.
In addition, I really appreciate the approach that the RethinkDB team has taken. Their approach to features and growing the product has been impressive and well calculated. I've followed several issues, and know that they work well on changes that take multiple steps and iterations to achieve (automatic failover, for example). I'm hoping to have more opportunity to use their product in the future (out of my hands at my current position).
I think in 3-5 years consultants will be making a killing converting projects from one of the 100(s) of NoSQL flavors to an RDBMS or one of the few surviving NoSQL flavors.
I'm already carving out a niche on the east coast with Merb and Rails 2 systems. Every time I think the gold rush is over, another gift falls into my lap. Keep on keepin' on, novelty kings!
RethinkDB is great. It's the first NoSQL database I've used so far which I actually liked. The query language is sane and the developers definitely know what they're doing. The realtime features are amazing and work very well with Python + asyncio.
Biggest difference is RethinkDB is a JSON document store, and Cassandra is column-oriented. Cassandra is eventually-consistent, and RethinkDB is immediately consistent with all writes going through a primary node for each table.
RethinkDB also has a query language that's integrated very deeply with the driver language itself, vs CQL which is a SQL-like text string.
I'd bet this is due to a "smart typography" processor that converts runs of hyphens to em- or en-dashes.
The standard typing convention for decades has been to use "--" to indicate an em-dash, and that's certainly what I've learned. Many smart converters (including SmartyPants, the one John Gruber created as a companion to Markdown) make that conversion. However, there's also the TeX convention of converting "--" to an en-dash and requiring "---" for an em-dash; some processors will use that standard by default, like Pandoc and MultiMarkdown. If you're typing "--" for em-dash but you're unknowingly using a processor that turns that into en-dashes, well, you get this.
(At some point, Tumblr converted their internal Markdown processor from "--" for em-dash to "--" for en-dash, and hundreds of my tech blog posts now look stupid because all the em-dashes have become en-dashes. Thanks, Tumblr.)
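To make the mismatch concrete, here is a small sketch of the two conventions described above. The function name `smarten` and the convention labels are made up for illustration; this is not the code of SmartyPants, Pandoc, or any other real processor, just plain string replacement that mimics each rule.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def smarten(text, convention):
    """Replace runs of hyphens with dashes under one of two conventions."""
    if convention == "double_is_em":
        # Typing convention: "--" stands for an em-dash.
        return text.replace("--", EM_DASH)
    if convention == "tex":
        # TeX convention: "---" is an em-dash and "--" is an en-dash.
        # The longer run has to be handled first.
        return text.replace("---", EM_DASH).replace("--", EN_DASH)
    raise ValueError(f"unknown convention: {convention}")


typed = "all under the hood -- a questionable decision"
print(smarten(typed, "double_is_em"))
print(smarten(typed, "tex"))
```

The same typed text comes out with an em-dash under the first rule and an en-dash under the second, which is the effect described above when the processor changes underneath you.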
nice one! i've seen the same problem but didn't know why (always reluctant to eliminate myself as the root cause). Agreed that re-wiring 20+ year old print-to-digital-symbol conversions on one side then doing it again on the other side--all under the hood is a questionable design decision.
Great write-up and delightful to see RethinkDB performing well. I think it is a truly fantastic database to use. Easy installation, the built-in HTTP admin, and query language make ad-hoc data querying fun and smooth.
I really want to try RethinkDB in one of my side projects, which is gonna be a SaaS product. What are the best practices/support for multi-tenancy with RethinkDB?
The front end will mostly be Rails (I currently use apartment gem with postgres).
[0] - https://github.com/prakhar1989/ADS
[1] - https://twitter.com/prakharsriv9/status/675049396636110848
[2] - https://hub.docker.com/r/tjake/jepsen/