Hacker News
Adding Context to CockroachDB’s Article “Your Database Should Work Like a CDN” (danielcompton.net)
84 points by dantiberian on Feb 8, 2018 | hide | past | favorite | 9 comments



My problem with CockroachDB comes down to two major issues:

1. I need a read slave. With MariaDB Galera, I can slave a database off a local master, but Cockroach AFAIK can only do master-master. Adding another node to the cluster just to read from it seems like a lot of unneeded overhead, and can cause massive issues if there are sync problems.

2. WAN sucks. Where I work, I've seen mysql slaves get hours out of sync due to packet loss. With a centralized system, I don't have to worry about reconciling data. If every site can accept data, then I have to assume that every site will remain in sync. If Site 1 gets out of sync and/or stops transmitting data, another site could make further changes to the record in question, causing issues. In some places this is perfectly fine, but when dealing with transactions counting clicks, I do not want to be dropping data or running into conflicts.

> Sharded Relational databases come in many shapes and suffer from as many different types of ailments when deployed across regions: some sacrifice replication and availability for consistency, some do the opposite.

I think this is a great point. You have to know what your database is doing, and how it does it. It seems to be becoming more common for people to develop on something without knowing how it works, only to get bitten down the road. Any problem is going to get exacerbated when you introduce it to the wonderful world of WAN.

I'm not ragging on the product; it's a good-enough product for most use cases. I just wanted to share my thoughts. If you can tolerate partial consistency, go for it. It's a great run-and-done product.

> Only companies with very sophisticated operations teams can seriously set an SLA for five nines. It’s doable, but comes with a heavy lift if your service is doing anything non-trivial. It’s certainly not the default position among the industry as far as I can tell.

Five 9s is doable for a small group if you work from the ground up (infrastructure to application). Dual-site Postgres Citus with a slave coming off the Citus cluster, sitting in front of a pgpool-II load balancer. Then replicate a slave out to each site you are running in the world. If you are running an API out there, use RabbitMQ or Kafka to relay transactions back to one of the sites, or accept the latency and run raw commands over the wire.

Five 9s when you are trying to drop in a solution is extremely hard, so major kudos to CockroachDB for making that reachable with a drop-in solution.
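For context, the downtime budget behind "five 9s" is just arithmetic (a quick sketch; the figures below are standard availability math, not from this thread):

```python
# Allowed downtime per year at a given availability target.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960 minutes

def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of permitted downtime per year at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability)

for label, avail in [("three 9s", 0.999), ("four 9s", 0.9999), ("five 9s", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(avail):.2f} min/year")
```

Five 9s leaves you roughly five minutes of total downtime per year, which is why it demands the ground-up design described above.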


> 1. I need a read slave.

Have you actually tested this need? It strikes me as unfair to assume a solution that works well for Galera is also necessary for Cockroach. They have quite different architectures. CDB allows you to mix serializable and snapshot transactions; the latter should be quite efficient for read-only workloads. You can create a user that has only SELECT privileges granted, just like with most other SQL databases. I would not assume you need a second read-only daemon instance per node just based on your experience with another product.

> 2. WAN sucks. ... If every site can accept data, then I have to assume that every site will remain in sync.

This is not how CockroachDB works (or Spanner-style systems in general, for that matter). You do not have to assume: CDB's replication is linearizable. You will not see things out of sync, and the scenario you describe will not happen. The worst you have to live with is that the system becomes unavailable if you lose more nodes in a single replica group than your quorum settings can tolerate (i.e., 1 out of 3 or 2 out of 5).
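The failure tolerance behind those quorum numbers is plain majority voting (a minimal illustration of the arithmetic, not CockroachDB code):

```python
def tolerable_failures(replicas: int) -> int:
    """A majority quorum of n replicas survives the loss of floor((n - 1) / 2) nodes."""
    return (replicas - 1) // 2

# 3 replicas tolerate 1 lost node; 5 replicas tolerate 2.
print(tolerable_failures(3))  # 1
print(tolerable_failures(5))  # 2
```

This is why replica groups come in odd sizes: going from 3 to 4 replicas adds cost without raising the number of failures the group can survive.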

> I'm not ragging on the product; it's a great good enough product for most use cases. I just more wanted to share my thoughts. If you can tolerate partial consistency, go for it.

This is literally the opposite of the entire point of cockroach's design. You do not have to tolerate partial consistency. It's fully consistent. That's the whole point. I hope people reading along see more than your comment, otherwise they're going to walk away thinking cockroach is exactly the opposite of what it is.


I'm just trying to understand this better.

> 1. I need a read slave. With MariaDB Galera, I can slave a database off a local master, but Cockroach AFAIK can only do master-master. Adding another node to the cluster just to read from it seems like a lot of unneeded overhead, and can cause massive issues if there are sync problems.

I don't know whether CockroachDB can do this or not, but there's no reason raft followers can't serve stale reads. Given that, I don't understand this concern.

> 2. WAN sucks. Where I work, I've seen mysql slaves get hours out of sync due to packet loss. With a centralized system, I don't have to worry about reconciling data. If every site can accept data, then I have to assume that every site will remain in sync. If Site 1 gets out of sync and/or stops transmitting data, another site could make further changes to the record in question, causing issues. In some places this is perfectly fine, but when dealing with transactions counting clicks, I do not want to be dropping data or running into conflicts.

The whole point of CockroachDB is that you never see inconsistencies (at least not without opting in to them). But if you are unable to talk to any of the other replicas, then you would have to be dropping data. You don't really have a choice here: if you have a network partition, you pick unavailability or inconsistency. This isn't something Galera can fix either.


1. In my research, raft followers would still be thinking for themselves. Each node could potentially become master, which would be fine in most cases, but I just want a dumb slave database to pull a read-only copy from. My concern is latency and load. I want to be able to predict hotspots. I don't want api-node-3 to become the leader for that site; I want db-1 and db-2 to be where everything happens.


> In my research, raft followers would still be thinking for themselves.

There is a lot more going on than vanilla raft. Cockroach's range lease feature (which is NOT the same as raft leader election) automatically moves the lease to the node that minimizes latency. You can control this behavior via metadata and matching rules. I'd suggest skimming: https://www.cockroachlabs.com/docs/stable/demo-follow-the-wo...

Again, you're making a lot of assumptions that seem rooted in "cockroach == a raft cluster". There's a lot more going on than that.


> 2. WAN sucks. Where I work, I've seen mysql slaves get hours out of sync due to packet loss.

Have you tried the new BBR congestion control for TCP?

We've been seeing similar issues with PostgreSQL WAN replication and found that our TCP connections get throttled prematurely by the Cubic algorithm (to ~0.5 Mbps) well before the available bandwidth (~20 Mbps) is reached and actual congestion kicks in.

From our research it seems that switching to BBR on our replication origins should completely solve this problem, but we haven't had a chance to put it into production yet (gotta upgrade kernels first ...).

I'm just throwing this out there as an idea for you, but perhaps also because I hope to hear from those who have already had a chance to apply BBR to DB replication over pipes with elevated (0.1-1%) packet loss.
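For anyone wanting to experiment, switching a sender to BBR is typically a two-line sysctl change on Linux 4.9+ (a sketch of the usual procedure; this is standard kernel configuration, not something the commenter above has verified in production):

```shell
# Pair BBR with the fq qdisc it was designed to work with (sender side).
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr

# Confirm the active congestion control algorithm.
sysctl net.ipv4.tcp_congestion_control
```

Only the sending side of the replication stream (the origin) needs the change, which is why upgrading just the masters can be enough to test it.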


I'm still in the process of setting up my product, but my day job does mysql replication. The issues come when there is a good 5 minutes of 80% packet loss. It's caused some major issues.

I'll keep BBR congestion control in mind.


Here is a great paper on Spanner, which was the inspiration for CockroachDB:

https://static.googleusercontent.com/media/research.google.c...


> on it’s merits.



