Are there published benchmarks for multi-key operations and more complex SELECT statements? I apologize if I missed them.
I'm trying to determine whether there's a place for Cockroach within what I think are the constraints in the database space.
* Traditional SQL Databases
- Go-to solution for every project until proven otherwise.
- Battle tested and unmatched features.
- Heavily optimized, with incredible single-node performance.
- Good replication and failover solutions.
* Cassandra
- Solved massive data insert and retention.
- Battle tested linear scalability to thousands of nodes.
- Good per-node performance.
- Limited features.
Many new databases seem to offer scale-out but relatively poor per-node performance, so a mid-size cluster still performs worse than a single-node solution built on a traditional SQL database.
And if you genuinely need huge insert volumes, that weak per-node performance means you'd need an enormous cluster, whereas Cassandra would handle the load quite comfortably.
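To make the cluster-size concern concrete, here is a rough back-of-the-envelope sketch in Python. It assumes perfectly linear scaling (rarely true in practice), and the throughput figures are made-up placeholders, not measurements of Cassandra, CockroachDB, or any other database. The function name `nodes_needed` is purely illustrative.

```python
def nodes_needed(target_ops_per_sec: int, per_node_ops_per_sec: int) -> int:
    """Smallest node count whose combined throughput meets the target.

    Assumes throughput scales linearly with node count, which is an
    idealization: real clusters lose some capacity to coordination.
    """
    if target_ops_per_sec < 0 or per_node_ops_per_sec <= 0:
        raise ValueError("target must be >= 0 and per-node throughput > 0")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_ops_per_sec // per_node_ops_per_sec)


if __name__ == "__main__":
    target = 1_000_000  # hypothetical insert volume, ops/sec
    # Hypothetical per-node throughputs for two unnamed systems.
    for label, per_node in [("fast per-node", 50_000), ("slow per-node", 5_000)]:
        print(label, nodes_needed(target, per_node))
```

With these placeholder numbers, a tenfold gap in per-node throughput turns into a tenfold gap in cluster size for the same workload, which is the cost the question is worried about.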
[Cockroach Labs engineer here working on performance benchmarking]
We have load generators for YCSB (just raw key-value ops in a firehose) and TPC-H (very complicated read-only queries) running right now, and we're about to start running TPC-C queries (moderately complex queries in large volume) as well. You can follow along on our progress here: https://github.com/cockroachdb/loadgen
In the context of your dichotomy, we want to bridge that gap. We want the linear scalability of your second group along with the full feature set of the first group.
We will be publishing our performance numbers. We haven't so far because the product has been improving rapidly and our numbers quickly became obsolete, but rest assured, a series of blog posts is coming very soon. Anecdotally, our beta customers are not finding that they need many more CockroachDB nodes than they run for their existing database solutions, even with something as high-performance (but inconsistent) as Cassandra.