Ask HN: I have a question regarding strong consistency. The question applies to all geo-distributed, strongly consistent storage systems, but let me pick Google Spanner as an example.
Spanner has a maximum upper bound of 7 milliseconds of clock offset between nodes, thanks to TrueTime. To achieve external consistency, Spanner simply waits 7 ms before committing a transaction. After that amount of time the window of uncertainty is over, and we can be sure that every transaction is ordered correctly. So far so good, but how does Spanner deal with cross-datacenter network latency? Say I commit a transaction to Spanner in Canada and get my confirmation after 7 ms, but now someone in Australia also does a conflicting transaction and also gets their confirmation after 7 ms. Spanner, bound by the laws of physics, can only see this conflict after a 100+ ms delay due to network latency between the datacenters. How is that problem solved?
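To make the commit-wait rule concrete, here's a minimal sketch in Go. The Interval and Clock types are a hypothetical TrueTime-style API invented for illustration, not Google's actual interface:

    package commitwait

    import "time"

    // Interval is the uncertainty window returned by a TrueTime-style clock:
    // the true time is guaranteed to lie within [Earliest, Latest].
    type Interval struct {
        Earliest time.Time
        Latest   time.Time
    }

    // Clock abstracts the TrueTime API. Hypothetical, for illustration only.
    type Clock interface {
        Now() Interval
    }

    // CommitWait blocks until the chosen commit timestamp s is guaranteed to
    // be in the past at every node, i.e. until Now().Earliest > s. With a
    // ~7 ms uncertainty bound, this wait is at most about 7 ms.
    func CommitWait(c Clock, s time.Time) {
        for !c.Now().Earliest.After(s) {
            time.Sleep(time.Millisecond) // a real system would schedule, not poll
        }
    }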
The simple answer is that round trips between datacenters are required when you have a quorum that spans datacenters. Additionally, one replica of a split is the leader, and the leader holds write locks, so you already have to talk to the leader replica even if it's outside the DC you're in. Getting the write lock overlaps with the transaction commit, though. So for your example, say the leader is in Canada and the client is in Australia, and we're doing a write to row 'Foo' without first reading (a so-called blind write); a code sketch of the resulting flow follows the steps:
Client (Australia) -> Leader (Canada): Take a write lock on 'Foo' and try to commit transaction
Leader -> other replicas: Start commit, timestamp 123
Other replicas -> Leader: Ready for commit
Leader waits for a majority to be ready for commit and for timestamp 123 to be before the current TrueTime interval
Leader -> Other replicas: Commit transaction (and, in parallel, replies to the client)
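Here's a rough Go sketch of that leader-side flow. The Clock, Replica, Prepare, and Commit types are stand-ins invented for illustration; a real leader fans the RPCs out in parallel and replies to the client concurrently with the final commit messages:

    package commitflow

    import (
        "errors"
        "time"
    )

    // Interval and Clock model a TrueTime-style API, as in the earlier sketch.
    type Interval struct{ Earliest, Latest time.Time }

    type Clock interface{ Now() Interval }

    // Replica stands in for a Paxos group member. Prepare models
    // "Start commit, timestamp 123" and Commit models "Commit transaction".
    type Replica interface {
        Prepare(ts time.Time) error
        Commit(ts time.Time) error
    }

    var errNoQuorum = errors.New("no majority acknowledged the prepare")

    // LeaderCommit mirrors the steps above: assign a timestamp no earlier
    // than the latest possible current time, replicate to a majority, wait
    // out the uncertainty window, then commit. RPCs are sequential here
    // only for brevity.
    func LeaderCommit(c Clock, replicas []Replica) (time.Time, error) {
        ts := c.Now().Latest // guaranteed >= the true time at assignment

        acks := 0
        for _, r := range replicas {
            if err := r.Prepare(ts); err == nil {
                acks++
            }
        }
        if acks <= len(replicas)/2 {
            return time.Time{}, errNoQuorum
        }

        // Commit wait: don't make ts visible until it is definitely in the
        // past. This overlaps with replication, so it is often free.
        for !c.Now().Earliest.After(ts) {
            time.Sleep(time.Millisecond)
        }

        for _, r := range replicas {
            r.Commit(ts) // errors ignored in this sketch; Paxos handles stragglers
        }
        return ts, nil
    }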
Of course there are things you can do to mitigate this depending on your needs, but there's no free lunch: if you have a client in Australia and a client in Canada writing to the same data, you're going to pay for cross-continent round trips on transactions.
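To put rough numbers on that, a back-of-envelope lower bound in the same Go sketch style (all constants are assumed for illustration, not measured):

    package latency

    // Illustrative assumptions for the example above.
    const (
        rttClientLeaderMs = 180 // Australia client <-> Canada leader, round trip
        rttLeaderQuorumMs = 30  // leader <-> nearest majority replica, round trip
        commitWaitMs      = 7   // TrueTime uncertainty window
    )

    // MinCommitMs: one client->leader round trip, plus whichever is longer
    // of quorum replication and commit wait, since the two overlap.
    func MinCommitMs() int {
        overlap := rttLeaderQuorumMs
        if commitWaitMs > overlap {
            overlap = commitWaitMs
        }
        return rttClientLeaderMs + overlap // ~210 ms with these numbers
    }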
Writes are implemented with pessimistic locking: a write in a Canada datacenter may have to acquire a lock in Australia as well. See page 12 of the paper for the F1 numbers (mean of 103.0 ms with a standard deviation of 50 ms for multi-site commits).
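As a minimal illustration of where that cross-site cost comes from, here's a toy leader-side lock table in Go. The API is made up for illustration, and it omits the wound-wait deadlock avoidance the paper describes; the point is just that remote writers reach this code only via an RPC to the leader:

    package locks

    import "sync"

    // LockTable is a toy per-key write-lock table held at the split leader.
    type LockTable struct {
        mu    sync.Mutex
        locks map[string]chan struct{} // key -> release signal for current holder
    }

    func NewLockTable() *LockTable {
        return &LockTable{locks: make(map[string]chan struct{})}
    }

    // AcquireWrite blocks until the write lock on key is free, then takes
    // it. A writer in another datacenter pays a cross-continent round trip
    // just to reach this point, which is where the F1 multi-site commit
    // latency comes from.
    func (t *LockTable) AcquireWrite(key string) (release func()) {
        for {
            t.mu.Lock()
            if ch, held := t.locks[key]; held {
                t.mu.Unlock()
                <-ch // wait for the current holder to release
                continue
            }
            ch := make(chan struct{})
            t.locks[key] = ch
            t.mu.Unlock()
            return func() {
                t.mu.Lock()
                delete(t.locks, key)
                t.mu.Unlock()
                close(ch) // wake any waiters so they can retry
            }
        }
    }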