This sounds too flattering toward Riak; it almost sounds like an ad rather than ...

nirvana · on Sept 30, 2011

I think that Basho deserves congratulations for moving the ball forward in the NoSQL space. If CouchDB had just achieved something similar, I'd have written a similar though admittedly shorter post.

By definition, every NoSQL solution has the downside of not having SQL. Given the popularity of SQL, this immediately rules them out for a lot of people.

Compared to other solutions, Riak obviously has the advantage from my perspective, given what I value, which is why I chose it.

I'm building a (soon to be open sourced) web development platform on top of Riak.

I am not interested in debating the merits of other solutions, so I won't participate in followups when people disagree with what I say below. That's fine; everyone values different things and has different priorities for various features to solve their different problems.

Here's how Riak compares to the competition, from my perspective:

CouchDB-- CouchDB supports replication from any CouchDB to any other CouchDB with the changes feed. This is a killer feature, and one that Riak doesn't really have. With Cloudant, they have taken CouchDB and spread it over a Dynamo-style ring, which makes it, in some ways, similar to Riak. CouchDB essentially pre-computes its views, which makes it not a good match for my purposes (which is why I started looking elsewhere in the first place... in fact it was CouchDB using erlang_js, a Basho library, that was my first exposure to Basho). I think CouchDB is probably a great database for a lot of purposes, but to be honest, after the merger with Membase it seems like CouchIO/CouchOne disappeared into a puff of marketing terms and I haven't been able to make heads or tails of what's changed with it over the last year or so.

Cassandra-- Looked into it a couple times, tried to make heads or tails of it, couldn't really, possibly because I'm looking for a document-oriented database to begin with. Didn't like the scalability story either. I don't know if I'll ever need to scale or not. I'm going to start with a small cluster that fits my duplication/safety requirements more than anything... but if we do need to scale, I know that re-architecting things is the last thing I'm going to want to be worrying about, as there are going to be many other things to deal with.

MongoDB-- Have never understood the appeal of this. They choose speed over robustness (which is the opposite of what I would choose) and their scalability story is not the no-brainer, no-thought, just-add-a-server, don't-worry-about-it approach that I think is important. I'm sure MongoDB is faster than Riak on a single node. But scaling from 1M requests a day to 100B requests a day will be much easier and faster (in terms of development time and headaches) with Riak... at least that's what I believe.

Hadoop-- A big old rambling project, and a cluster of open source solutions. Whatever you want to do with Hadoop, someone's done it, and if it's at all common, 8 people have done it in slightly different ways. I think Pig is really wonderful, and given the release of 1.0, something like Pig is the only big feature that Basho hasn't really addressed... (you mean I have to write my queries in Erlang, bob?) but operationally, Hadoop is too confusing, too much of a moving target, and involves too many decisions that don't fit my personal style. (For many years Java was my favorite language, but I have to admit I'm an Erlang snob these days. If you're not writing a distributed platform in Erlang, the first thing I'm going to want to know is why, and the rest of the evaluation will suffer under that cloud. I'm more proficient in Java than Erlang, but I'm comfortable looking at Riak internals, while the thought of looking at Hadoop internals fills me with great dread. In well-designed Erlang programs, a module is a single file and doesn't have a lot of dependencies... I imagine the same functionality in Hadoop will be spread across dozens of classes, though I might be wrong.)

SQL-Anything-- I need just-add-servers-and-don't-worry-about-it robustness and scalability. I don't want to think about sharding, or architecting my applications to support the database... the database should STFU and do its job, and grow with me adding servers. I don't have the budget for an ops guy. I need to run without an ops guy for quite a while. SQL itself really doesn't add anything for me.

So, in summary, a couple of alternatives have nice features that Riak could use, but Riak doesn't have any real disadvantages or downsides when compared to them... at least based on what's valuable to me (which is minimizing my development time and greatly minimizing my time wearing the operations hat).

Edited: Fixed where I mistakenly typed "CouchDB" (the Apache project) when I meant "CouchIO/CouchOne" (the company). Also upgraded CouchDB's replication from "nice" to "killer", which more accurately represents my opinion of it.

jchrisa · on Sept 30, 2011

I'm a cofounder at Couchbase. All the work we do is open source, and we contribute heavily back to the Apache CouchDB project.

Pardon our marketing dust... it's nothing compared to the actual engineering we've been doing, but that's a story for another day.

Congrats to Basho for reaching the 1.0 milestone!

rnewson · on Sept 30, 2011

CouchDB did not merge with membase. You are thinking of Couch One, the company. CouchDB remains an Apache project.

rdtsc · on Oct 1, 2011

Please, let's not be pedantic.

Yes, CouchDB is an Apache project and Couchbase Single Server is a Couch One product. Except that they are almost exactly the same. Couchbase Single Server is already pre-built into easily installable rpms, it has GeoCouch integrated, but not a whole lot more compared to CouchDB trunk (Am I wrong?).

So as a developer you want to play around with Couch. Which one do you pick? ... Exactly. Aside from terminology, there is basically a fork at the moment. I understand that CouchOne Server eventually will combine membase + CouchDB in it, and it is a great step forward, but currently it makes things a bit harder. It is also not exactly easy to find a comparison of exactly which features are in which. So it is sort of guesswork.

pablochacin · on Oct 4, 2011

Translating Pig-like high-level data flow jobs to Riak's pipe/Erlang would be a major endeavor, but definitely one that would be worth the effort.