> clients can do globally consistent reads across the entire database without lo...

jdcarr · on Feb 14, 2017

> In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area, which many find surprising or even unlikely. The claim thus merits some discussion. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA. The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.

jhugg · on Feb 14, 2017

I would expect more from Brewer.

"CA except when there are partitions" is CP. It's not "effectively CA".

xapata · on Feb 14, 2017

No, he's saying it's effectively CAP because the A downtime is so small.

It's one thing to do that for a key-value store. Entirely another to support joins on a globally distributed database. This ain't just one availability zone. Spanner is amazing.

It took them a few years to make it a service, but when they announced its use internally a few years ago, it seemed like the nail in the coffin for in-house database hosting.

jhugg · on Feb 14, 2017

I understand what he's saying. It's marketing.

There's nothing wrong with saying "it's CP, but since we control everything, partitions are extremely rare." Then he can show availability numbers (which he kinda does).

Saying it's "effectively CA" defeats the point of the CAP theorem, which says you have to make tradeoffs. See: https://codahale.com/you-cant-sacrifice-partition-tolerance/
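
To make the trade-off concrete, here is a minimal sketch of what "chooses C and forfeits A" looks like under a partition. It assumes a plain majority-quorum rule over a replica group; the function name `can_commit` and the replica counts are illustrative, not anything taken from Spanner's actual implementation.

```python
def can_commit(reachable: int, total: int) -> bool:
    """Return True only if a strict majority of replicas is reachable.

    Hypothetical rule for illustration: a CP system refuses writes
    on any side of a partition that cannot reach a majority.
    """
    return reachable > total // 2


# Five replicas split 3/2 by a network partition.
TOTAL_REPLICAS = 5
majority_side_ok = can_commit(3, TOTAL_REPLICAS)  # keeps serving writes
minority_side_ok = can_commit(2, TOTAL_REPLICAS)  # rejects writes: unavailable

print(f"majority side accepts writes: {majority_side_ok}")
print(f"minority side accepts writes: {minority_side_ok}")
```

The minority side stays consistent precisely because it stops answering, which is the "forfeits A" part. How often that happens in practice is a separate question from which letter gets given up.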

xapata · on Feb 14, 2017

> It's marketing.

No, it's engineering. It's the recognition that if periods of unavailability are too small and too rare to be noticed, then the system behavior is indistinguishable from an "available" system in the sense of the CAP theorem.

It's like the "Retina" display you're probably reading from. There are pixels, you just can't see them.
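
To put "too small and too rare to be noticed" in numbers, here is a rough conversion from an availability fraction to downtime per year. The availability levels below are generic "nines" chosen for illustration, not figures quoted from Google or the Spanner paper.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def downtime_seconds_per_year(availability: float) -> float:
    """Expected unavailable seconds per year for a given availability fraction."""
    return (1.0 - availability) * SECONDS_PER_YEAR


# Illustrative availability levels only (three, four, and five nines).
for availability in (0.999, 0.9999, 0.99999):
    minutes = downtime_seconds_per_year(availability) / 60
    print(f"{availability:.5f} available: about {minutes:.1f} minutes down per year")
```

At five nines that works out to a little over five minutes a year, which is the kind of budget where other failures (client bugs, last-mile networking) may dominate what a user actually sees.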

noahl · on Feb 14, 2017

Another point is that since all records are globally timestamped, you can do a read that is consistent as of a timestamp in the past (e.g. read the data as it was 1 second ago).

If data from other zones has already been replicated to your zone, you may be able to do this globally consistent read while only touching your local datacenter (because TrueTime guarantees that no other record anywhere in the system will later be committed with a timestamp at or before the time you are querying).
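
A toy sketch of the idea, in case it helps: a local replica keeps multiple timestamped versions per key and can answer a read at timestamp `ts` by itself, as long as it knows it has received every write up to `ts`. The class `VersionedStore`, the `safe_time` field, and the integer timestamps are all assumptions made for illustration; this is not Spanner's API, and it does not model TrueTime's uncertainty intervals.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedStore:
    """Toy multi-version store for one local replica (hypothetical)."""

    def __init__(self):
        # key -> list of (commit_ts, value), kept sorted by commit_ts
        self._versions = defaultdict(list)
        # Assumption: every write with commit_ts <= safe_time has been applied here.
        self.safe_time = 0

    def apply(self, key, commit_ts, value):
        """Apply a replicated write carrying its global commit timestamp."""
        versions = self._versions[key]
        versions.append((commit_ts, value))
        versions.sort(key=lambda version: version[0])

    def advance_safe_time(self, ts):
        """Record that no write with a timestamp <= ts can still arrive."""
        self.safe_time = max(self.safe_time, ts)

    def read_at(self, key, ts):
        """Return the value of key as of ts, using only local data."""
        if ts > self.safe_time:
            raise RuntimeError("replica not caught up to requested timestamp")
        versions = self._versions[key]
        timestamps = [commit_ts for commit_ts, _ in versions]
        index = bisect_right(timestamps, ts)
        return versions[index - 1][1] if index else None


store = VersionedStore()
store.apply("x", 10, "a")
store.apply("x", 20, "b")
store.advance_safe_time(25)
print(store.read_at("x", 15))  # sees the version committed at ts=10
```

The interesting part is `advance_safe_time`: the local read is only safe because something outside this sketch guarantees no earlier-timestamped write will show up later.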

Note: I work at Google, but I don't know more about Spanner than the Spanner paper.

xapata · on Feb 14, 2017

Check out the papers. They revealed Spanner a few years ago. Other commenters have provided links.