> Cockroach is a distributed key:value datastore (SQL and structured data layers of cockroach have yet to be defined) - emphasis mine
I guess this is interesting, but distributed, strongly consistent pure K-V stores have been done before (ZooKeeper, etcd, etc.). It seems like the vast majority of the hard work is left to do. I don't want to get into naming arguments, but I wouldn't really call this a 'database' yet. It doesn't sound like you can do anything but a key lookup or range query currently, which is incredibly limiting for most real-world applications.
I somewhat question the approach. Why not figure out the hard part first, i.e. build the `SQL and data layers` on top of ZooKeeper or etcd, then replace the backend to scale better? I would think this would get a lot more early adopters. As is, the alpha fills a very niche use case.
If you look at the documentation (e.g., [1]), the design has been rather carefully thought out; it's just that they're implementing it from the bottom up.
According to their roadmap [2], they're aiming for KV functionality in 1.0 and aren't aiming for SQL until past version 2.0 (it's currently alpha).
Given the backgrounds of the technical people involved (including Google, as this project is inspired by Spanner), they should have a lot of experience with what they're trying to accomplish.
As for "done before", a core feature of Cockroach is true ACID transaction support, including snapshot isolation, something no distributed NoSQL database I know about supports. (ArangoDB does support transactions, but is mostly NoSQL in the sense of implementing a different query language than SQL.)
Exactly right. The hard part is building a key-value store with a powerful notion of transactions (not just compare-and-set or the like), and that's what's mostly done. Structured data is still work, but on the shoulders of giants.
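To make the "not just compare-and-set" distinction concrete, here is a rough Python sketch. It is a toy, single-process stand-in and not Cockroach's actual protocol or API: `TinyKV`, `commit`, and `transfer` are hypothetical names made up for illustration. It contrasts a single-key compare-and-set with a multi-key optimistic commit that validates every read before applying all writes together.

```python
import threading


class TinyKV:
    """Toy in-memory store (hypothetical); each key maps to (value, version)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        # Missing keys read as (None, 0).
        with self._lock:
            return self._data.get(key, (None, 0))

    def compare_and_set(self, key, expected_version, value):
        """Single-key CAS: write only if the key's version is unchanged."""
        with self._lock:
            _, version = self._data.get(key, (None, 0))
            if version != expected_version:
                return False
            self._data[key] = (value, version + 1)
            return True

    def commit(self, read_versions, writes):
        """Multi-key optimistic commit: all writes apply, or none do."""
        with self._lock:
            # Validate that nothing we read has changed since we read it.
            for key, seen in read_versions.items():
                if self._data.get(key, (None, 0))[1] != seen:
                    return False
            for key, value in writes.items():
                _, version = self._data.get(key, (None, 0))
                self._data[key] = (value, version + 1)
            return True


def transfer(kv, src, dst, amount):
    """Move `amount` from src to dst, retrying until the commit validates."""
    while True:
        src_val, src_ver = kv.get(src)
        dst_val, dst_ver = kv.get(dst)
        src_val = src_val or 0
        dst_val = dst_val or 0
        if src_val < amount:
            return False  # insufficient funds; nothing written
        if kv.commit({src: src_ver, dst: dst_ver},
                     {src: src_val - amount, dst: dst_val + amount}):
            return True
```

With CAS alone, the two writes in `transfer` would be separate operations, so a reader could see the money gone from one key and not yet arrived at the other. Doing this across machines, rather than under one local lock as in this toy, is the hard part being discussed.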
> As for "done before", a core feature of Cockroach is true ACID transaction support, including snapshot isolation, something no distributed NoSQL database I know about supports.
ZooKeeper has ACID transactions, which I believe are linearizable (which trumps SI). The downside is the memory-only working set, but given how cheap memory is, I'd still rather have a memory-only ZooKeeper with a rich query interface than a large-storage KV store with a minimal query interface.
> ArangoDB does support transactions, but is mostly NoSQL in the sense of implementing a different query language than SQL
ZooKeeper is not a general-purpose database. I have not heard of anyone using it as one, either.
> What is your definition of NoSQL?
I don't have one, and I think the term isn't terribly useful. But the whole idea of NoSQL started as an attempt to break free of the relational aspect of SQL, because things like joins, strict schemas, foreign keys, and normalization were perceived as getting in the way of distribution. ArangoDB supports joins (but not foreign keys, because it's schemaless) and an SQL-like query language, which makes it a lot closer to an SQL database than something like Redis or Cassandra.