
Riak is actually a distributed datastore, whereas CouchDB & MongoDB are replication-based.

For example, there are no special nodes in Riak and no one node has all the data. When a node joins a Riak cluster, it begins to share and participate in the cluster.

CouchDB replicates the entire dataset from one node to another; in MongoDB you need to do sharding (which will be implemented in the 1.1 release, from what I understand).

Couch & Mongo are both pretty awesome kv stores, but neither implements a distributed datastore like Riak (or Cassandra).



How is cross system locking of nodes handled? To write or read, do all nodes need to be in sync?


Are you familiar with the CAP theorem? You can't have all three, but you can have two. Most distributed datastores pick for you which two of the three you'll get.

One of Riak's goals is to make the choice transparent to the user per document and let the user select which they want. There's some more information about that on their website: http://riak.basho.com/cap.html

EDIT: this is good info about it too http://riak.basho.com/nyc-nosql/
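
To make the "user picks the trade-off" idea concrete, here is a rough sketch of the usual quorum arithmetic for a replicated store. The names n, r and w (replica count, read quorum, write quorum) and both helper functions are my own illustration, not Riak's actual API; see the linked page for how Riak really exposes this.

```python
def quorum_is_valid(n, r, w):
    """A quorum only makes sense if it asks for between 1 and n replicas."""
    return 1 <= r <= n and 1 <= w <= n


def read_sees_latest_write(n, r, w):
    """If r + w > n, every read set overlaps every write set in at least
    one replica, so a read contacts at least one replica holding the
    latest acknowledged write. Otherwise a read may return stale data,
    in exchange for needing fewer replicas to respond."""
    return quorum_is_valid(n, r, w) and r + w > n


# Hypothetical settings for a value stored on 3 replicas.
consistent_leaning = read_sees_latest_write(n=3, r=2, w=2)  # overlapping quorums
available_leaning = read_sees_latest_write(n=3, r=1, w=1)   # any single replica will do
```

With r=2, w=2 the read and write sets must overlap; with r=1, w=1 a request succeeds as long as one replica answers, but may miss the newest write.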


The reason I asked is that you brought up that CouchDB isn't continuous where yours is, but you approach CAP in the same way CouchDB does. So the difference between continuous and scheduled replication is mostly syntactic sugar. CouchDB can be scheduled with cron to replicate across nodes, or to pull from them continuously, for similar functionality.


Hmm, it's not really the difference between continuous and scheduled... Basically, Couch is a shared-nothing system, meaning it retains all the data on one node. Then you can easily replicate the data from node to node (say, via a scheduled task with cron).

Riak is a distributed system, where each node that joins becomes part of a cluster of nodes and shares everything (data, work, etc).


Ok, thanks for clarifying :)


So this isn't a shared-nothing system? How do they handle failing nodes, and prevent data loss?


Riak creates vnodes, or virtual nodes, for each node. These vnodes make up what's called a ring server. So you may have 1024 vnodes spread across, say, 4 bare-metal nodes.

I pointed this presentation out below, but I'll point it out again, because it's a really good presentation for understanding Riak: http://riak.basho.com/nyc-nosql/
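
As a rough sketch of the 1024-vnodes-on-4-machines layout described above: the node names, the round-robin assignment, the SHA-1 key hashing and the replica count of 3 are all assumptions of mine for illustration, not something taken from Riak's code.

```python
import hashlib

RING_SIZE = 1024                               # vnodes in the ring, as in the example above
NODES = ["node1", "node2", "node3", "node4"]   # hypothetical bare-metal nodes
HASH_SPACE = 2 ** 160                          # size of the SHA-1 output space


def build_ring(ring_size, nodes):
    """Assign vnodes to physical nodes round-robin: entry i is the owner of vnode i."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def vnode_for(key, ring_size=RING_SIZE):
    """Hash the key and map it onto one of ring_size equal slices of the hash space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") * ring_size // HASH_SPACE


def replicas_for(key, ring, copies=3):
    """Return (vnode, owner) for the key's vnode and the next copies - 1 vnodes,
    wrapping around the end of the ring."""
    start = vnode_for(key, len(ring))
    return [((start + i) % len(ring), ring[(start + i) % len(ring)])
            for i in range(copies)]


ring = build_ring(RING_SIZE, NODES)
placement = replicas_for("some-key", ring)
```

Because neighbouring vnodes belong to different machines in this layout, the copies of a key land on different physical nodes, so losing one machine does not lose the only copy.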


Thanks for the link, bonus points for the video being on vimeo!



