Introducing Cloud Spanner, a Global Database Service (googleblog.com)
1068 points by wwilson on Feb 14, 2017 | 434 comments



Congratulations to the Spanner team for becoming part of the Google public cloud!

And for those wondering why Oracle wants billions of dollars from Google for "Java copyright infringement": the only growth market for Oracle right now is their hosted database service, and whoops, Google has a better one now.

It will be interesting to see whether Amazon and Microsoft choose to compete with Google on this service. If we get to the point where you have database, compute, storage, and connectivity services from those three at equal scale, well, that would be a lot of choice for developers!


> It will be interesting to see whether Amazon and Microsoft choose to compete with Google on this service. If we get to the point where you have database, compute, storage, and connectivity services from those three at equal scale, well, that would be a lot of choice for developers!

There are also plenty of choices evolving for developers who aren't looking for hosted solutions (which can sometimes be a showstopper for enterprise on-prem deployments). There's a growing ecosystem of distributed open-source databases to look out for too.

Take Citus, for instance – a Postgres-compatible distributed store which automatically parallelizes normal SQL queries across machines. It's as easy to set up as adding an extension, and people are doing some staggering things in prod with it.
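
For a flavor of the setup, here's a minimal sketch (not from their docs - the events table, tenant_id column, and connection string are all made up):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    cur = conn.cursor()

    # Enable the extension on the coordinator, create an ordinary table,
    # then ask Citus to shard it across the worker nodes by tenant_id.
    cur.execute("CREATE EXTENSION IF NOT EXISTS citus;")
    cur.execute("CREATE TABLE IF NOT EXISTS events (tenant_id bigint, payload jsonb);")
    cur.execute("SELECT create_distributed_table('events', 'tenant_id');")
    conn.commit()

    # Normal SQL from here on; Citus parallelizes it across the shards.
    cur.execute("SELECT tenant_id, count(*) FROM events GROUP BY tenant_id;")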

Different audience from BigQuery and Spanner, but no less exciting.

Disclaimer: no professional association, but love their product and the team.


Craig from Citus here. Thanks for the kind words. We've seen a lot of people scale-out transactional workloads with Citus as well. In particular, we've seen a lot of multi-tenant apps that need to keep scaling beyond a single node when they're running into memory or compute issues.

If you are looking for something that is more Postgres-flavored (meaning we're just an extension to it, so you get all the good stuff of Postgres such as JSONB, PostGIS, etc.), then we hope we'd be a good fit. And we run a managed service on top of AWS as well (https://www.citusdata.com/product/cloud), built by the team that built Heroku Postgres. If you're curious about pricing, you can find it at https://www.citusdata.com/pricing/


It would be very interesting to have your product at the $200-300 price point. Currently, the lowest tier starts at almost $2,000 per month for the high-availability version.

I'm not trying to compare on a per-MB level, but it would be nice for smaller-scale workloads.


Helpful feedback. We do have a development plan for $99, but it's not really intended for production workloads. If you only have 10 GB of data, we'd heavily recommend going with something like RDS or Heroku Postgres. At that amount of data, single-node Postgres works great.


I really like your attitude towards something like this: when your product would be overkill for a use case, you just recommend a different product. I also really like your blog posts about Postgres; we use them a lot with developers for explaining a bunch of internals, like the one on how to paginate in Postgres.


Link to blog post?


I believe this is the one they're referring to on pagination: https://www.citusdata.com/blog/2016/03/30/five-ways-to-pagin...

Though hopefully you'll find many more useful ones about Citus and Postgres broadly on our blog: https://www.citusdata.com/blog/
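
For the impatient, the punchline of the pagination post is keyset pagination. A rough sketch (table and column names made up):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical
    cur = conn.cursor()

    # Keyset pagination: filter on the last key seen instead of OFFSET n,
    # so each page is a cheap index range scan rather than scan-and-discard.
    last_id = 0
    while True:
        cur.execute(
            "SELECT id, body FROM comments WHERE id > %s ORDER BY id LIMIT 100;",
            (last_id,),
        )
        rows = cur.fetchall()
        if not rows:
            break
        last_id = rows[-1][0]  # resume after the last row of this page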


RDS is not single-node - it's multi-AZ replicated. And that's what we are paying $300 instead of $99 for.

Imagine: RDS is literally the ONLY place where you can buy 10 GB of multi-AZ replicated, snapshotted, and managed PostgreSQL.

It's pretty much a monopoly, now that Google seems to have officially closed the book on ever supporting PostgreSQL.


> Imagine: RDS is literally the ONLY place where you can buy 10 GB of multi-AZ replicated, snapshotted, and managed PostgreSQL.

Not true. I was looking for a hosted Postgres provider and discovered these two:

https://aiven.io/postgresql (tried it, worked excellently)

https://www.elephantsql.com


> now that Google seems to have officially closed the book on ever supporting PostgreSQL.

I would be hesitant to say that this is a fact.


> It's pretty much a monopoly, now that Google seems to have officially closed the book on ever supporting PostgreSQL.

Uh, how? I wouldn't be surprised to see a Cloud SQL-like managed Postgres service from Google.

While there's obviously some overlap in the potential market for any relational datastore service, Spanner doesn't really overlap with a cloud Postgres service as much as Cloud SQL does.


I would be. It's been years, and Google has been building out various pieces of infrastructure around MySQL - including Cloud Spanner.

The issue is that the migration path from self-hosted MySQL to Cloud SQL to Spanner is pretty well defined. I don't see PostgreSQL being strategically important or relevant to Google for anything.

If I were a startup deciding on my database, there are far fewer compelling reasons to choose PostgreSQL from the point of view of long-term viability. Hell, I can pretty much do a back-of-the-envelope calculation on how much it will cost me to support 100 million users on MySQL.

Is it safe to think that Evernote and Snapchat - startups that are giant success stories - are on Google-hosted MySQL? (In some form... maybe even Spanner.)

So Uber, Snapchat, Google, Evernote, and a clear-cut path for upward scale.

I have very little hope for PostgreSQL on Google Cloud.


> It's been years, and Google has been building out various pieces of infrastructure around MySQL - including Cloud Spanner.

What does Cloud Spanner have to do with MySQL? It's neither API- nor SQL-dialect-compatible with MySQL. If there are MySQL bits used somewhere in the implementation, they are well hidden, and irrelevant to users.

> The issue is that the migration path from self-hosted MySQL to Cloud SQL to Spanner is pretty well defined.

So what? Were there a Cloud SQL-like Postgres offering, the same would be true; Spanner is no closer to MySQL than to Postgres. (If anything, its SQL dialect is a little closer to Postgres's than to MySQL's, though not so much that you'll get away without doing substantial conversion coming from either.)


You keep saying this - have you actually talked to them? Find some PMs and send a few emails. Postgres is definitely in progress.


>Google seems to have officially closed the book on ever supporting PostgreSQL.

Er, that's a bet I'd strongly suggest you didn't make.


Right? Apps written to use Postgres aren't just going to be rewritten to use Spanner.

If anything, hosted Postgres from Google Cloud will be priced in a way that makes Spanner somewhat more attractive, as a way to get conversions to Spanner in the long run.


There are several managed hosting companies that will run Postgres (and other databases) for you on public clouds: Compose, Aiven, ElephantSQL, Database Labs, Heroku, etc. There are all kinds of price points, and GCP is working on supporting Postgres internally.

How many nodes are you looking to run for $300/month? Unless you have more than 150 GB/node of data, you don't really need a distributed database, which is what Citus is for.


Not at the price point that RDS is at. The starting cost of a multi-AZ deployment is lower. For a startup that's just starting out, it is the best and safest choice. Even Heroku, but I'm not very sure about its reliability versus RDS.

Please note that what I'm paying for is availability and reliability... not a database per se.

And I'm not even talking about Aurora. That stuff is going to blow every other price point out of the water, at probably higher reliability metrics.


RDS is only multi-AZ if you want it to be (and are willing to pay twice as much).


As someone who works (in part) in the MS SQL field, is it irrational to be a bit worried about the effects some of these platform advances might have on one's career?

For example, being an MSSQL performance-tuning expert requires years of experience and probably pays very well, but just the other day I read an anecdotal story where someone switched a large BI database to columnar indexes, allowing them to replace very complex queries (extremely manually tuned to achieve acceptable performance) with just standard SQL at comparable performance.
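
For anyone who hasn't seen the feature, the change in that story is roughly one statement - a sketch via pyodbc, with an invented fact table and DSN:

    import pyodbc

    conn = pyodbc.connect("DSN=warehouse", autocommit=True)  # hypothetical DSN
    cur = conn.cursor()

    # After this, plain aggregate queries over FactSales can get batch-mode
    # columnstore scans instead of hand-tuned row-store plans.
    cur.execute("""
        CREATE NONCLUSTERED COLUMNSTORE INDEX ix_factsales_cs
        ON dbo.FactSales (OrderDate, ProductId, Quantity, Amount);
    """)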

How long until the scale, pricing, and now transparent and full(?) SQL compliance offered by these cloud platforms start to make traditional RDBMS platforms a niche product?


Microsoft has a history of sales and support that will allow them a certain longevity. They also have less "brand hate" than Oracle. I don't think MSSQL is going to be like Sybase any time soon, but I probably wouldn't focus on that stack starting now if you are into the startup or California scene. For many places in the USA, MS is the way to go.

EDIT: Also, most DB users don't need global-scale databases.


Not sure if this is what you intended, but you are aware that SQL Server was developed in partnership with Sybase until the mid-90s (when they were substantially the same product), right?


Neat history! I was not aware of that! I meant to imply that Sybase's RDBMS offering is no longer a big player and I would not want to bet the future of my career on being an expert in it.


> ... but I probably wouldn't focus on that stack starting now if you are into the startup or california scene.

That is not necessarily a foregone conclusion. In tandem with their marketing of Azure, MS is pushing SQL Server on Linux heavily [0].

"SQL Server is Windows-only" is no longer a valid argument to choose another RDBMS if a startup uses lots of MS tooling but deploy on Linux servers.

[0] https://www.microsoft.com/en-us/sql-server/sql-server-vnext-...


I'm not saying MS is going to abandon the platform, but to me it seems entirely possible that "very soon" these cloud platforms using cheap, shared, commodity hardware might be so affordable and technically capable that they'll be the no-brainer choice unless you have a very good reason to use MSSQL (kind of the opposite of today: on-prem by default, cloud if necessary).


What it will do is encourage MSFT to drop the cost of MSSQL licenses in Azure.


> As someone who works (in part) in the MS SQL field, is it irrational to be a bit worried about the effects some of these platform advances might have on one's career?

There's a reason why regulated (including self-regulated) professions have continuing education requirements; progress happens and you become obsolete if you don't keep up with it.

Just because tech isn't regulated doesn't mean it's any more sensible to expect to remain valuable without keeping up with progress in the field.

That being said, MSSQL experts will likely have good-paying opportunities for quite a while, for the same reason that's the case for any well-established enterprise technology: lots of systems are going to be around using it long after it has become distressingly uncool to spend time learning.


I write software as a developer. This is how I earn my livelihood.

Four years ago, I determined that while development work might seem to be near the top of the food chain, there will come a point where my work will be replaced by AIs.

This is not so different from how word processors replaced the specialist job of typesetters. Word processors make "good enough" typesetting. You can still find typesetters practicing their craft; the rest of us use word processors and don't even think about it.

At the time, I was learning to put the Buddhist ideals of emptiness and impermanence into practice, and to become more emotionally aware: the _main_ reason I had thought I would never be replaced by an AI writing software had more to do with wishful thinking and attachment than with any clear-sighted look at it.

I also made a decision to work on the technologies that accelerate this. Rather than becoming intoxicated by the worry, anxiety, and existential anguish, I decided to try to face it. Fears are inherently irrational, but just because they are irrational does not mean they are not what you are experiencing. Fears are not so easily banished by labeling them as irrational. Denial is a form of willful ignorance.

Now, having said all that, whether our tech base will come to that, who can say?

Since then, I have been tracking things like:

Viv - a chat assistant that can write its own queries.

DeepMind's demonstration of creating a Turing-complete machine with deep learning using a memory module.

I watched a tech enthusiast write a chat bot. He does not write software professionally. Talking with him over the months as he tinkered with it in his spare time, I realized that in the future, you won't have as many software engineers writing code; you would learn how to _train_ AIs once they become sufficiently accessible to the masses. Skills in coaching, negotiation, and management become more important than some of the fundamental skills supporting software engineering. And like typesetting, I can see development work being pushed down the eco-ladder.

It's not surprising to me to see that Wired article about coding becoming blue-collar work. And even that will eventually be pushed down further.

It's not surprising to me about Google's site-reliability engineering book, branding, and approach. I have done system admin work in the past, and I can already see traditional, manual sysadmin work being replaced.

It's easy to get nihilistic about this, but that isn't my point here either. I know the human potential is incredible, but I think we have to let go of our self-serving narratives first.


I find this fascinating. There are a few ideas that are at play. One is the march of progress seeking to automate everything. The rationale of automation is to improve productivity. But what happens when everything is automatic? I don't see a corollary being played out at the moment. There are a small number of people reaping the benefits, and huge swathes of the population being marginalised and disenfranchised as a result.

The second idea that interests me is this idea of very high technology. It is built upon layer after layer of very clever tech, year after year, and I wonder how long it would take to start again from scratch if some disaster rendered a large part of one of these layers unusable.

For instance, if you were on a desert island, could you (would you want to?) build some piece of tech? An electric generator would be useful, perhaps. How long would it take to build? You'd need knowledge, raw materials, plant, fuel, etc. It's not an easy solve. And that's way down the tech stack, before you start talking about AIs. I suppose what I'm saying is that the AI layer is based upon such high tech that it is inherently fragile, because it is so hard to do.


> There are a few ideas that are at play. One is the march of progress seeking to automate everything. The rationale of automation is to improve productivity. But what happens when everything is automatic? I don't see a corollary being played out at the moment.

I don't know! :-D

I don't know what society would look like from a purely technological point of view. From a spiritualist point of view, though, it could either go very well or very badly. When everything is automated, would people have enough time and space to really start asking the really big questions? Or would it accelerate and intensify existential anguish?

> There are a small number of people reaping the benefits, and huge swathes of the population being marginalised and disenfranchised as a result.

Yeah. Arguably, this has already happened.

> The second idea that interests me is this idea of very high technology. It is built upon layer after layer of very clever tech, year after year, and I wonder how long it would take to start again from scratch if some disaster rendered a large part of one of these layers unusable.

The stuff of sci-fi :-D Among them, alt-history novels (what happens when someone drops into a lower-tech era; you'd have to start from 0 ... literally, 0, as in Arabic numerals).

Open Source Ecology is trying to preserve some of this tech base. I find their aims awesome, though I am not sure how effective it is.

The flip side is things being said from well outside the techno-sphere (for example, by shamans and mystics). It is the perspective that the further evolution of human consciousness will, at some point, no longer require technology or artifacts - technology seen as the last crutch. The collapse of a high-technic civilization then sets the stage for the removal of that crutch, and humans learn to stand on their own two feet (so to speak).


Anything that you can easily specify and describe in detail can be automated. In practice, the world is filled with computers that need programming to cope with the ever-changing, chaotic actions of our users. Personally I'm long on software developers, despite being very excited by how AI has blown up lately.


Agree. I remember someone advising me against getting a degree in computer science back in 2000. The argument was: look at MS Word - what else would you want to add to it? It has more features than you need.

Not a fair argument against the point made above; however, I believe we will find the next big challenge for software to solve as soon as traditional problems are commoditized/automated and considered solved. Also, just knowing how to code is not going to be enough. You must complement it with domain expertise to solve challenging, unsolved real-world problems.


I would say: quite long enough, but the scale/importance of traditional deployments will be going down through that time. If you're looking around the market regularly, you should have ample time to notice that gigs are not what they used to be, so maybe it's time to change area.


People are abusing databases like MSSQL to do things they may not be good at. Large-scale analytics is an example where databases like Infobright give amazing performance.


It'll be interesting to see how well customers adopt this. When I was at one of the two companies you mentioned above, we tried adding global snapshots (a la TrueTime, which is the real innovation in Spanner, not the clocks) and demoed it to our DBAs/MVPs. They didn't understand what on earth was going on. They wanted something that worked with existing clients. We just gave them classic 2PC and they went home happy. I think that's the reason why Oracle will keep on chugging: there just aren't that many workloads that need this sort of scale. It is really cool technology, though, and we always used to wonder why Google wasn't offering Spanner as a service.


As a bit of a veteran in the database industry, I concur (at least about the impact on Oracle's database business). There is a lot of pent-up demand for anything that offers distributed consistency.

We've been seeing this demand at Fauna. FaunaDB offers distributed consistency, based on Raft and the Calvin protocol instead of depending on specific networking and clock hardware. We've found that a big part of our appeal is the ability to run FaunaDB across multiple cloud services.


Wait, Fauna uses the Calvin protocol? Would you mind linking a white paper? I didn't realize it was in use outside Calvin/CalvinFS.


We have kept it under wraps. The whitepaper will be ready next month most likely.


Awesome! I'll keep an eye out.


What is the monetisation plan? Purely SaaS with on-premises, or an open-source version with support, like Postgres/MySQL?


The serverless cloud is pay-as-you-go. There is no minimum spend, unlike Spanner's $1000 per month (apparently). And it's cheaper than operating any open source on cloud hardware.

On-premises is licensed by core.

We have a developer edition you can use on your local machine, but we don't currently have plans to open source FaunaDB itself.


Where did you read that Cloud Spanner has a $1,000 per month minimum spend? I can't seem to find any mention of this.


Minimum 3 nodes * node pricing of $0.90/hour = $2.70/hour * 24 hours * 30 days = $1,944/month.


Where is this 3-node minimum mentioned? URL please?

Edit: Looks like maybe you're referring to the recommended 3 node minimum in production mentioned at https://cloud.google.com/spanner/docs/instance-configuration


While I'm not familiar with Spanner's inner workings, I would guess that they recommend 3 instances for quorum establishment in case a region becomes unreachable. If that's the case, using fewer than 3 instances could cause major problems.


Since Spanner currently only supports single-region deployments, it clearly isn't recommended as protection against a region becoming unavailable.

It may be recommended as protection against an availability issue on an instance, though, which is, after all, a big reason why you'd want a distributed DB in production.


I suppose the loss of a region doesn't apply (yet), but yes, the quorum requirement would still apply even if you only had instances in a single region.


Even within a single region you can put three nodes in different zones.


>well, that would be a lot of choice for developers!

A sad choice though. The centralization of computation is likely not a good thing in the long run.


The movement from ownership to renting on the web is absolutely terrifying to me. Within the span of a few years we've gone from owning our technology to renting it from a few big players for monthly fees that we cannot completely predict or control.

The advantages of owning your own hardware will never go away, but soon this will be made quite intentionally impossible as the big players coalesce and continue building their walled gardens.

This is already happening. All the big players own their hardware and rent it out to everyone else, while trying to convince everyone it's not worth owning your own hardware at the same time.

These companies have already begun closing off server platforms by developing custom hardware and software systems that cannot be bought for any price, only rented. These systems represent a new breed of technology with unbreakable vendor lock-in.

These same companies compete with each other and countless other companies across the space. Take for example a start-up that wants to run its own app store. Google, Amazon, and Microsoft all run app stores. Where will this company go for cloud services? Their only big-name options are to host their software on the hardware of a direct competitor. Their host has full visibility into how their system works, and control over the pricing and reliability of their machines.

It's laughable to think their "cloud partner" will give them any chance to compete if they enter the same market.

We've seen UEFI BIOS and un-unlockable mobiles enter the market in droves over the last few years. A lot of new PCs can't run anything except Windows. A lot of new phones can only run the carrier's version of Android. We have all these general-purpose CPUs that can no longer run general-purpose programs because "security", and a lot of lobbyists pushing to make it actually illegal to run your own software on them with "anti-tampering" laws, again for "security". Soon the big guys (same companies, MS and Google) will make it impossible to run your own software on any reasonably inexpensive device, and the walled market will be complete.

Mark my words: I've never seen an industry with a couple of big players where growth and innovation doesn't eventually turn into collusion, higher prices, and market stagnation. Once MS, Google, and Amazon have their slice of the pie and have killed off everyone else, we will see the death of general-purpose computers and mobile devices. Everything you buy will be an "Android computer", a "Windows computer", or an "Apple computer". Anything general-purpose will be massively more expensive, because individual companies can't get the kind of volume discount enjoyed by the giant behemoths that increasingly control large swaths of the world's computing power.

We've already seen the endgame, with Amazon trialing an "on-premises" version of their compute platform which is basically a super locked-down server that you can't buy, only rent endlessly. The future of on-premises will be a cloud in a black box if these companies have anything to do with it. Why? Because once they've got you locked in, it makes no sense to sell you anything for keeps. Why keep improving their product so you buy the new version, when they can just make it incompatible with everything else and force you to rent it forever, for whatever price they feel like charging?

One day running your own servers will be like running your own ISP: massively impractical, because the free market has been manipulated to the point that it effectively no longer exists.


> One day running your own servers will be like running your own ISP: massively impractical, because the free market has been manipulated to the point that it effectively no longer exists.

What? People use cloud computing because it already is massively impractical to run your own servers. Hardware is hard to run and scale on your own, and it benefits from economies of scale. This principle is seen everywhere and can hardly be viewed as controversial. Walmart, for instance, can sell things at a really low price because of the sheer volume of their sales. Similarly, data centers experience economies of scale.

For someone who cares about offering the best possible, reliable user experience, cloud computing is absolutely the next logical step from bare-metal on-prem servers. When your system experiences load outside the constraints of what it can handle, a properly designed app with independently scaling microservices scales horizontally.

Even if you had the state-of-the-art microservice architecture running on a Kubernetes cluster on your own hardware, you still wouldn't be able to source disk/CPU fast enough if your service happens to experience loads beyond what you provisioned.

And there's the rub: buying your own hardware costs money, and no one wants to buy hardware they may never use. Another advantage of cloud computing.

You are seeing the peak of the free market right now, because of cloud computing, which enables people with little upfront cash to form real internet businesses and scale massively.

You think a game like Pokemon Go could exist and do the release they did without cloud computing?


"Even if you had the state of the art microservice architecture running on a kubernetes cluster on your own hardware, you still wouldn't be able to source disk/CPU fast enough if your service happens to experience loads beyond what you provisioned." That basically means you never planned. As everyone moves to cloud what makes you think AWS, Azure wont have same issue. If entire region is down do you think other regions can handle the load. If you think so you're kidding yourself. Unless you have business where you dont know your peak number then cloud does not matter.


You can plan all you'd like; failures happen not necessarily due to poor planning but because, in real life, shit happens. Pokemon Go, for instance, experienced something like 50x the traffic they planned for.

Secondly, software companies like Microsoft, Google and IBM might know a thing or two about running data centers. Due to economies of scale, these companies are inherently in a better position to supply hardware at scale.

> If an entire region is down, do you think other regions can handle the load? If you think so, you're kidding yourself

Netflix routinely does just this to test the resilience of their systems. They pick a random AWS region, and they evacuate it. All the traffic is proxied to the other regions and eventually via DNS the traffic is routed entirely to the surviving regions. No interruption of service is experienced by the users.

Here's a visualization of Netflix simulating a failure on the US-east-1 region and failing over to US-west-1/US-west-2

https://www.youtube.com/watch?v=KVbTjlZ0sfE

The top right node is the one that fails. As the error rate climbs, traffic starts getting proxied over to the surviving nodes, until a DNS switch redirects all traffic to the surviving nodes. Netflix does this monthly, in production. They also run https://github.com/Netflix/SimianArmy on production.

The cloud enables fault tolerance, resiliency and graceful degradation.


I think you missed the point, Netflix evacuating a region is not the same thing as that region failing. If the whole region goes down, their (AWS's) total capacity just took a major hit and unless they have obscenely over-provisioned (they haven't), shit is going to hit the fan when people start spinning up stuff in the remaining regions to make up for the loss.


>The cloud enables fault tolerance, resiliency and graceful degradation

No, tooling to failover and spin up new instances does that. An enterprise with 3 data centers can do that.

"the cloud" is just doing it on someone else's hardware.


Have you run your own servers in a colo? I've done it myself.

One person, with maybe 3 hours a week of time investment after a few weeks of setup and hardware purchase. Using containers I can move between the cloud and my own servers seamlessly, as long as I never bite the golden apple and use any of the cloud's walled-garden "services" like S3. If I need more power I can spin up some temporary servers at any cloud provider in a few hours. For me the cloud is a nice thing because I don't use too much of it. If AWS disappeared tomorrow it would be a mild inconvenience, not devastating like it would be for many newer unicorns.

Go ahead and try to use the cloud you're paying for as a CDN or DDoS shield, or anything amounting to a bastion of free speech. You'll quickly find out that your cloud provider doesn't like you to use all the bandwidth and CPU you pay for, and they don't like running your servers when they disagree with your views. They quietly overprovision everything, pulling the same crap as consumer ISPs where they sell you a 100 Mbps line and punish you if you use more than 10 of that on average. That's the main reason the cloud is so cheap.

Hardware is cheap, colo's are cheap, software is largely easy to manage. The economy of scale they enjoy is from vendor lock-in and overprovisioning more than anything else.

Is it really that hard to double the number of servers you own every few weeks? No! If you're using containers or managed KVM you can mirror nodes basically for free over the network as soon as the Ethernet is plugged in. Your time amounts to what it takes to put the thing in a rack, plug in the Ethernet, and hit the "on" button. Everybody in SV land thinks you have to use the cloud to "scale massively", but they forget that all of today's technology behemoths were built years ago when the cloud didn't exist. Oh yeah, they all still run all of their own hardware too, and have from the early days. Using their model as a template, you should own every single server you use and start selling your excess capacity once you get big enough.

Did you ever read about how Netflix tried to run their own hardware but can't because they have so much data in AWS that it would basically bankrupt them to extract it? Look at how these cost models work. Usually inbound bandwidth is extremely cheap or free, but outbound is massively more expensive than a dedicated line at a datacenter - 50-100 times the cost if you're saturating that line 24/7. The removal fees from a managed store like S3 or Glacier are even more ludicrous. The cloud is like crack, and as soon as you start using it more than a few times a year you will get locked in and unable to leave without spending massive $$$. Usually companies figure out this shell game once they're large enough, but by then it's far too late to do anything about it.

Why are they marketing these things so heavily to startups? Because lock-in is how they make their money. They make little or nothing on pure compute power, but since you don't have low-level hardware access they can charge whatever the hell they want for things like extra IPs, DDoS protection, DC-to-DC peering, load balancing, and auto-scaling. You give massive discounts to new players using these systems, and inevitably some of them will become the next Uber or Netflix. Then you are free to charge whatever exorbitant rates you please once it's so impractical to move that it would require a major redesign of the business.

I see it a lot like franchising. By building on Amazon's cloud services you become "Uber company brought to you by Amazon". Like franchising, your upside is limited because any owner with a significant share of total franchises will begin to put pressure on the service owner itself.


To be honest, you sound like a conspiracy nut hell-bent on hating the Cloud. Maybe you should take a deep breath and try to open up to the possibility that the Cloud is actually a good thing, and Cloud providers aren't the Illuminati trying to "lock you in". Well, maybe they are. Of course every cloud provider wants you to use their services.

But any "lock in" is totally up to you. Take a look at this: https://kubernetes.io/

You can architect your system in a way that it'll run on any cloud provider. All the major Cloud Providers support kube for orchestration.

To be honest I don't think you know what you're talking about. You should refrain from making uninformed opinions on hacker news, especially on a throwaway.


> Did you ever read about how Netflix tried to run their own hardware but can't because they have so much data in AWS that it would basically bankrupt them to extract it?

Where did you read this? You can have Amazon send you a truck full of hard drives. I doubt it costs more than Netflix can afford.


Never mind, I misremembered the story I read about them. They moved the main site to AWS with the huge omission of their movie streaming system. Their own Open Connect servers are far cheaper to use for this because of massive AWS outbound data costs.

Also, the truck is for data in, not data out. Getting data out of AWS is far more expensive than putting it in. That's the lock in.


The 'huge omission' is by design.

> Also, the truck is for data in, not data out. Getting data out of AWS is far more expensive than putting it in. That's the lock in.

This is also not true. The bulk transfer service is bi-directional.


The Open Connect servers are for the edges, not the core.

They cache popular content close to the users, they don't manage their catalog at the edges.


You did not ever own your own globally consistent, massively scalable, replicated database. The fact that you can now rent one by the hour is strictly an improvement for you, if you need that kind of thing.


Cassandra also does that, without requiring the "magic" of a system you can only get from a single vendor and never buy. In the same time that these walled gardens have come up, free software has grown to fill the gaps.


Cassandra is sort of a Bigtable without transactions. It is not comparable to Spanner at all.


Spanner is unique in a lot of ways, but it still trades speed for consistency.

The most unique thing about Spanner is the use of globally synchronized clock timestamps to guarantee "comes before" consistency without the need to actually synchronize everything.
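
Roughly, the trick is "commit-wait". A toy sketch of the idea (nothing like Google's actual implementation; the 7 ms uncertainty bound is just an illustrative number):

    import time

    CLOCK_EPSILON = 0.007  # assumed worst-case clock uncertainty, in seconds

    def tt_now():
        """TrueTime-style interval [earliest, latest] bracketing true time."""
        t = time.time()
        return (t - CLOCK_EPSILON, t + CLOCK_EPSILON)

    def commit_timestamp():
        # Take a timestamp at the top of the uncertainty window, then wait
        # until it is guaranteed to be in the past on every node, so that
        # timestamp order matches real-time ("comes before") order.
        _, latest = tt_now()
        while tt_now()[0] < latest:
            time.sleep(0.001)
        return latest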

There is nothing stopping startups and open-source developers from building the same thing in a few years. The missing ingredient is highly stable GPS and local time sources, which will hopefully be available on cloud instances sometime soon. This is a new piece of hardware, so it will be interesting to see if cloud providers make one available or use the opportunity to sell their own branded "service" version you can't buy. Unfortunately I think we'll see the latter far before the former, if either ever exists. Without a highly stable time source, doing what Spanner does will be completely impossible.

Yes, Spanner is special right now, but that's even more reason not to go near it. Google has a complete monopoly on it - the strongest vendor lock-in you can possibly have.


> This is a new piece of hardware

Only "new" in the sense that it is currently not commonly offered, the devices themselves have been available for ages. (If you are a large enough customer you apparently can get at least some colo-facilities to provide you with the roof-access and cabling needed for the antennas). If cloud providers make precise time available I don't see much potential for locking you in with their specific way of providing it, as long as it ends up as precise system time in some way.


I'm saying I doubt they will ever offer it, precisely because it would conflict with their paid offerings. The fact that it takes its own hardware is a great excuse not to give your customers the option.

I know GPS time sources have been available forever, but a fault-tolerant database needs a backup. The US GPS is incredibly reliable, but there have been multiple issues with both GLONASS and Galileo.

It sounds like Google has an additional time source making this possible - probably a highly miniaturized atomic clock, possibly on a single chip. There's no way they're running on GPS alone.


Yes, they clearly say that they use atomic clocks in addition, but those are commercially available as well: an atomic clock for short- to mid-term frequency stability, and GPS to keep it synced to global time. E.g., in many cases, mobile-phone base stations contain just such a setup, and the data-center versions should fit in a few rack units.


And then all you need is a team of 12 full time SREs to manage it.


A system built on top of it? Possibly, but that's the trade-off if you don't want to pay for/be locked in to somebody else running it. For just the timing stuff: not really. Of course it adds complexity, but these things are established and should be quite stable.


The absolute level of computation available isn't changing at the consumer level. What's happening for the next decade is the destruction of businesses hosting their own IT infrastructure and moving it to a couple of core centers.

So, the computational "Gini index" is increasing, but no one is being thrown into computational poverty.


>What's happening for the next decade is the destruction of businesses hosting their own IT infrastructure and moving it to a couple of core centers.

Yes, and this will be disadvantageous over the long run for people that want to run things themselves. Ultimately companies like AMD/Intel go where the big money is at. As things centralize further and further, there will only be 3 customers they care about in the server market.


This isn't true. They desperately want to enable users outside of the big cloud vendors as they have very little price leverage over the big vendors


But that won't matter if users keep moving to the cloud. They will be forced to just work on whatever Amazon/Microsoft/Google want.


> The absolute level of computation available isn't changing at the consumer level.

Maybe not, but consumers increasingly use centralized computation resources. I would guess that by now most applications used by consumers run in their web browser, such as Facebook.


The parent comment doesn't seem to specify "consumer level" and the loss of businesses having their own infrastructure is equally troubling. Everyone is putting a lot of eggs in a very small number of baskets.


I would disagree about the character of the situation. This isn't about people putting eggs in a few baskets, it's that it's more efficient to have centralized chicken coops instead of every family in the world owning their own chickens.

Now, you could play with that analogy further and see some issues as well, but I don't think the issue here is centralized failure; all these data centers/"clouds" are at least good. The Cloud is about businesses focusing on core business and not supporting functions.

[Disclosure, I work on the Google Cloud team, I'm biased]


>focusing on core business and not supporting functions.

Having a devops team with the necessary expertise in Google Cloud or AWS is still a supporting function. You've just traded one skill (managing physical servers) for another (managing proprietary virtual resources).


But hopefully a smaller team, and one that keeps diminishing in size over the years if the trend continues. At least for the same level of service (in availability, security, etc.).


Monocultures are efficient, but not healthy ecosystems in the long term.


Let's look at your metaphor. It's more efficient for the raising of a large overall number of chickens. It's less efficient when I need fast access to a single egg.

Hence we get caching. There's the farms, then the inbound warehouses, then the distribution centers, then the grocer, then our refrigerators by the dozen or dozen and a half. When your local cache is empty of eggs, though, it requires a trip back out to the grocer to get an egg even if you need nothing else that trip. Then you generally have to buy at least half a dozen if not a dozen or more eggs just to get the one you wanted.

If I have my own couple of hens, I can go out into the yard and get an egg. If that's the whole of my fetch list, it's much more efficient for this single egg to have the hens laying right out back.

This whole few-baskets metaphor breaks down from another point of view, though, when we consider that by the very nature of using a globally distributed hosted service we're actually eliminating a single-basket problem. Yes, there's not much choice among just Google, Amazon, and Microsoft. (That they are the only options is a bit of a strawman, but let's grant this one legs.) However, putting just your own employees in charge of all your infrastructure in just your own datacenter(s) in just PostgreSQL or just MySQL is another single-basket problem. Spreading it out so that someone else gets to manage the hardware and the service, and replicating your data widely within that service, is from that point of view more baskets. You get more datacenter baskets, more employee baskets, and more software baskets. Using standard SQL means you can move among compliant software later, too, so you're not as tied to those baskets.

Now, back to your coop analogy. What's stopping me from having my application talk to Cloud Spanner and a local database proxy (or a work queue that sits between the app and the DB, or whichever) so I can use Google's reliability for transactions and my local cached replica for query speed when I'm querying older data? Why can't I keep a few eggs around?

Also, why would I be scared of Google or Amazon "having my data"? Why would I put sensitive data into my own database in plaintext and then replicate it among multiple datacenters that way?


> it's that it's more efficient to have centralized chicken coops instead of every family in the world owning their own chickens.

Only if the owner of the chicken coop has everyone else's best interests in mind. Protip: they don't.

The Cloud isn't about efficiency, it's about data control. Getting people's systems and data into Google/AWS/etc helps with data mining, vendor lock-in, etc. Often times that can be efficient, but also it often isn't.


That's like being sad about the emergence of banks, because everybody's money is being kept in a small number of vaults instead of under each one's mattress.


A good point, but there is an up and downside to everything. The centralization of IT does impact civil liberties and possibly innovation - unlike FOSS and other local systems, aspiring hackers can't tinker with Facebook code and see how it works.


I don't understand how Facebook relates to this, as they don't rent their cloud. Aspiring hackers couldn't tinker with MS Word 2000 code either.


> Aspiring hackers couldn't tinker with MS Word 2000 code either.

They could tinker with the binaries, something many did with game binaries. But your point is well taken; open source is also very valuable to innovation.


Web apps were also very useful for learning JS and browser APIs, before everybody started minifying and obfuscating everything. I learned how to write a rich-text editor just by looking at the code of Hotmail's email editor.


Fair enough, but think of that free and open stack: (layer 1), Ethernet, IP, TCP/UDP, HTTP/SMTP/DNS/etc, HTML/JavaScript. How many cut their teeth on those?

The apps on top, Facebook, Snapchat, etc., are not so open and much of what they do is out of reach from the user.

Also, I meant to add above: People could tinker with data files (e.g., Word docs), configurations, etc. The whole system was local and accessible. You could write local code, such as VB or for Windows, that integrated with those systems.


Creating 3 massive banks that the entire world gets to choose from would be terrible.


Not sure the banks:mattresses and cloud-companies:IT-companies ratios are that different.


That strategy resulted in the Great Depression and later the 2008 crisis. The damage was so high that the country had to be rescued by the federal government. So banking is a decent example of how such consolidation into private hands can go wrong. Now we just apply that to IT services and data.


That's a ridiculous argument: Banks started being a thing at the end of the Middle Ages. The Great Depression and the Great Recession were not caused by banks emerging, nor by people putting their savings in them.


Not emerging. Just being themselves with all their schemes and an economy dependent on them. A distrust of banks and their schemes at a national level might have reduced their ability to cause those problems. On top of the smaller stuff such as them delaying deposits or withdrawing stuff for bogus reasons.


Putting your savings under the mattress instead of in a bank account wouldn't have prevented the Great Recession. It was caused by risky mortgages (debts, not savings) being sold as low-risk from bank to bank, and then defaulting.

Putting your savings under the mattress instead of in a bank account wouldn't have prevented the Great Depression either.

The only thing it would have accomplished is making your savings easier to steal.


Storing gold or other valuables instead of Federal Reserve notes for sale or bartering wouldn't have helped during the Great Depression? I haven't heard the angle that there was nothing to barter with on top of worthless dollars.


We've already been through it. People eventually abandoned mainframes for everything they could. Many of the current customers are interested in better solutions but just stuck due to lock-in of piles of COBOL, etc.


> The centralization of computation is likely not a good thing in the long run.

I agree. It only makes sense if you need special data for statistics, AI training, etc.

In all other cases the classic way of programming on PCs and notebooks is smarter. If you do everything in the cloud, what happens if you lose your Internet connection? I've had that experience several times over the last few years.


My internet connection is more stable than my computer.


I'm not sure that's widely true. Consider:

* Most Internet usage is via smartphone

* Computers are much more stable than they used to be

* Much of the world lives in places with less stable connections

* The most expensive spec in an Internet connection is availability. You can get a low-end 15 Mbps connection with no availability guarantee for $40/month; a T1 is one-tenth the speed and costs 10 times as much (all numbers are rough estimates).


You mentioned smartphones. Once your computer dies you can use your smartphone. You can use your neighbour's or colleague's computer.

There are only so many heavy industries that definitely need some sort of infrastructure (probably not the master, though) to be local.


>the only growth market for Oracle right now is their hosted database service, and whoops, Google has a better one now

It could end up that way, but lacking INSERT and UPDATE will likely limit this to a niche market for now.


I thought SQL-compatible means INSERT and UPDATE are available. Why aren't these statements available?


Technically those are DML statements.
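
Right - at launch the SQL surface is query-only, and writes go through a mutation API in the client libraries instead. A sketch with the Python client (instance, database, and table names invented):

    from google.cloud import spanner

    client = spanner.Client()
    database = client.instance("my-instance").database("my-db")

    # Reads can use SQL...
    with database.snapshot() as snapshot:
        rows = list(snapshot.execute_sql("SELECT id, name FROM users"))

    # ...but writes are mutations rather than INSERT/UPDATE statements.
    with database.batch() as batch:
        batch.insert(
            table="users",
            columns=("id", "name"),
            values=[(1, "alice")],
        )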


Amazon's Aurora databases seem to be solving the same problem, and are MySQL or Postgres compatible to boot.


Aurora is very cool but won't help you much after you vertically scale your master and still need more write capacity. With Cloud Spanner you get horizontal write scalability out of the box. Critical difference.


So if I'm understanding you, with Aurora all writes go to one master and you're constrained by the biggest instance AWS offers. Is that right?

Do you have a sense of what that limit is?

There's a pretty big price difference between Spanner and Aurora at the entry level so it's useful to explore this.


> Do you have a sense of what that limit is?

Per their pricing page [1], it looks like the largest instance available is a "db.r3.8xlarge", which is a special naming of the "r3.8xlarge" instance type [2]: 32 vCPUs and 244 GB of memory.

That's a hell of a lot of capacity to exhaust, especially if you're using read replicas to reduce it to only/mostly write workloads. Obviously it's possible to use more than this, but the "sheer scale" argument is a bit of a flat one.

[1] https://aws.amazon.com/rds/aurora/pricing/ [2] https://aws.amazon.com/ec2/instance-types/#r3


Wouldn't the write master be I/O-bound, rather than CPU- or memory-bound?


I disagree; the "sheer scale" argument is not flat. The fact that one can scale horizontally and the other can't is significant.

Let me present a quote to you: "640K ought to be enough for anybody."


You can disagree on that if you'd like, but note that I explicitly acknowledged the possibility of exceeding these limits. In my opinion, for most cases/workloads, it's highly unlikely that you will and designing for that from the outset is a waste of time and resources.


Yes, Aurora has a single write master, though it does have automatic write failover -- i.e. if the Aurora primary dies, one of your read replicas is promoted to the primary and reads/writes are directed to the new instance. That does constrain your primary's capabilities to the largest instance size (currently a db.r3.8xlarge).

I don't have a good idea what the upper limit is for an Aurora database setup.


How does Aurora know that the primary is dead? Automatic failover is problematic in a distributed system.


AWS uses heartbeats for detecting liveness. If x heartbeats fail, the failover procedure is started - generally 10 seconds to 5 minutes. In practice (for me) the failover has been less than 15 seconds.
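
Conceptually it's just a missed-heartbeat counter - a toy sketch, not AWS's actual mechanism:

    import time

    def monitor(ping, failover, interval=1.0, max_misses=3):
        # ping: returns True if the primary answered within its deadline.
        # failover: promotes a replica (hypothetical; supplied by the caller).
        misses = 0
        while misses < max_misses:
            misses = 0 if ping() else misses + 1
            time.sleep(interval)
        failover()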


My concern was more around split brain. If you fail over while the write master is simply unreachable, pain results.


Aurora's read replicas share the underlying storage that the primary uses, so AWS claims that there's no data loss on failover. They also claim -- and I've never heard anyone say they were wrong -- that Aurora failovers take less than a minute. So the pain should be limited to under a minute of lost writes, which most applications can handle (with an error). It can still be painful depending on the application.

See here for more info: https://aws.amazon.com/rds/aurora/faqs/#high-availability-an...


Yeah, the latency on that failover isn't specified.


Do you mean the amount of time it takes to initiate a failover or the amount of time for a failover to complete?

For the former, I don't think they specify beyond "automatic".

For the latter, "service is typically restored in less than 120 seconds, and often less than 60 seconds": http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora...


That's a pretty good cutover, but as you say, they should also include the time needed to detect a failure and initiate the transition.


Amazon provides a testing methodology here: https://d0.awsstatic.com/product-marketing/Aurora/RDS_Aurora... which might be useful to explore when benchmarking the two services against each other.


Aurora is a 'better MySQL mousetrap', IMO.

This is a globally-available, nearly-CAP-beating datastore that powers one of the biggest websites on the internet.

It's not quite apples and oranges, but this is definitely a different problem they are solving.


That's vague. AWS also powers huge websites and Amazon is recommending Aurora as the "default choice" for most workloads.[1] There are certainly significant architectural differences but I would say we can definitely make a direct practical comparison.

[1] http://www.computerworld.com/article/2953299/cloud-computing...


If Aurora powers huge websites, Spanner is for ginormous websites. Think a multiple of Netflix's database needs.


Curious to know: what are Netflix's needs for a relational database?

Doesn't strike me as a business with complex logic.


Netflix mainly uses Cassandra as their database.

And their needs are reasonably complex. They use machine learning and big-data analytics to generate the list of videos that you should be watching. In order for those to work, they need to capture a whole raft of end-user metrics, e.g., at what point you paused video X.


I'd assume they keep track of who watches what for their 'continue watching...' pane.

Netflix was given as an example of scale. I guess for another example, Spanner could be used to store every Visa transaction.


While Aurora doesn't provide true horizontal scalability, its single-node scalability seems so strong that it might allow many companies to stay on a single node for quite a while.

For example, see this benchmark:

http://2ndwatch.com/wp-content/uploads/2016/09/Graph-3.jpg

from this article:

http://2ndwatch.com/blog/benchmarking-amazon-aurora/

Thoughts?


Aurora's other zone replicas are read-only. Probably no atomic clocks and GPS for time synchronization.

To be fair, Spanner's cross-region service is coming "later 2017".


It is not close to equivalent. But I do want to get a better feel for whether Google really has figured out how to do basically the impossible. I want to see if this truly scales horizontally, but if it does, then competitors had better hope for a much more detailed paper :)


> It is not close to equivalent.

It's equivalent, with different (unknown) constraints. Aurora is specifically for scaling workloads in the same way. You can say it's horizontal (machine) over vertical (resource) but it's all a matter of accounting.

The big no-no is the Spanner price point. I will stick with Aurora, scaling based on the traffic I use, over pricey timeslices.

You would have to have quite a load to justify the switch from the cheaper du jour solutions right now (AWS). Relying on the few that do is a risk.


It seems like Oracle could have a play here, working on adapting cloud infrastructure tools for managing on-premises data centers. That keeps them in play for customers who can't put their data in the cloud and those that haven't because they're already Oracle customers.

This pendulum swings. We're pretty near the apex now. A little work on ergonomics and these tools could be turn-key, and back we go to decentralized hardware.


> It seems like Oracle could have a play here, working on adapting cloud infrastructure tools for managing on-premises data centers.

That's what the Exadatas are supposed to be.


> the only growth market for Oracle right now is their hosted database service

This is not true. Oracle is far more than a database company nowadays in the same way that Microsoft is more than Windows. Oracle has been acquiring high-growth startups at a significant rate.


I was going by this document: http://s1.q4cdn.com/289076952/files/doc_financials/quarterly...

Which has their 'cloud services' doubling their contribution to revenue year over year and licenses losing 50% of their contribution to revenue year over year.

Their 'cloud' collateral is pretty opaque though.


Yes, the "cloud" includes all of their non-database offerings too which have been the focus of recent growth/acquisitions.


A lot of Oracle's "cloud" products are just their traditional single-tenant/on-prem software that they offer to run for you in their own environments.

You're still buying the same stuff, but you're outsourcing your dev ops to them on top of it (which may not be a bad thing).


Well the product manager is from Oracle.


How is Google's cloud MySQL better than Oracle?


I didn't downvote you. It is important to note, though, that the Spanner project isn't related to MySQL; there is some discussion of that in the stories around Spanner. It would nominally compete directly with Oracle's flagship database product.


Spanner is related to MySQL at Google: a product with Spanner's semantics was required to replace MySQL for some important business operations (https://landing.google.com/sre/book/chapters/communication-a...)

It's actually hard to beat MySQL for a lot of things. I was skeptical about this when I joined Google, but as an SRE on the MySQL team around this time, I gained a lot of respect for it.


That is an interesting way to look at it; I have wrestled with MDB[1] while working at Google, it was a ginormous MySQL database (possibly one of the world's largest). And I would characterize Spanner's relationship this way, "If you think you are actually going to build an ACID database that scales, then make sure you can support the MySQL api that MDB uses and we'll see just how well it scales."

I don't know if anyone put it to them that way but as Spanner was just getting started when I left I know that one of its success criteria was to be able to be a scalable replacement for MDB. Given the white paper and other papers on their results, I'm sure it managed that requirement.

[1] MDB, Machine Data Base, used throughout the org but especially in Platforms and SRE to keep track of machines and their parts.


I don't know why you think MDB is ginormous by MySQL standards; it was actually quite modest.

Everything you need to know is here: https://research.google.com/pubs/pub38125.html

It walks through the architecture of the Ads DB, the issues with replacing MySQL, and some of the heroic efforts to implement it via Spanner.

At the time, I asked the team why they used MySQL instead of Postgres (which I prefer) and the short answer was: MySQL replication worked at the time.


Google Cloud SQL (hosted MySQL) is a completely separate product.


[flagged]


Please don't do this here.


Really a CP system, but with availability at five 9s or better (unavailability below one part in 10^5)

How: 1) Hardware - gobs and gobs of hardware and SRE experience

"Spanner is not running over the public Internet — in fact, every Spanner packet flows only over Google-controlled routers and links (excluding any edge links to remote clients). Furthermore, each data center typically has at least three independent fibers connecting it to the private global network, thus ensuring path diversity for every pair of data centers. Similarly, there is redundancy of equipment and paths within a datacenter. Thus normally catastrophic events, such as cut fiber lines, do not lead to partitions or to outages."

2) Ninja 2PC

"Spanner uses two-phase commit (2PC) and strict two-phase locking to ensure isolation and strong consistency. 2PC has been called the “anti-availability” protocol [Hel16] because all members must be up for it to work. Spanner mitigates this by having each member be a Paxos group, thus ensuring each 2PC “member” is highly available even if some of its Paxos participants are down."


> with the Availability being five 9s or better (less than one failure in 10^6)

Anyone know how exactly this is defined for them? (Time? Queries? Results?)


In general availability is defined in terms of time.

https://en.wikipedia.org/wiki/High_availability

Five nines means about 5 minutes of downtime per year.
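Back-of-the-envelope, if anyone wants to check the arithmetic:

    minutes_per_year = 365 * 24 * 60   # 525,600
    for nines in (3, 4, 5):
        downtime = minutes_per_year * 10 ** -nines
        print(f"{nines} nines: {downtime:.1f} minutes/year of allowed downtime")
    # 3 nines: 525.6, 4 nines: 52.6, 5 nines: 5.3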


Looks like they define it as "downtime where a downtime instance lasts > 30s"?


Well, this is what the spanner website says if you read the fine print:

>> This feature is not covered by any SLA

So I would guess that you don't get _any_ guarantees. Not five nines and not even one nine.


Currently yes, but that's because it's in Beta.


MTBF of 2PC-strapped-to-quorum is no different from that of 2PC strapped to single-point-of-failure replicas.

MTTR is bounded by re-election latency rather than replica recovery, although you may still eat a write amplification cost for re-replication.

Write amplification is 3-5x that of a non-quorum-backed 2PC system, depending on replication ensemble size.

Google further multiplies write amplification with geo-redundancy, so bump that WA by another 3x+.

It's an insanely high cost to pay for availability, but for an advertising company it's important to count the beans accurately.
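Spelling out the multiplication (the replica counts here are my assumptions for arithmetic's sake, not published Google numbers):

    quorum_replicas = 3        # copies per Paxos group; could be 5
    geo_copies = 3             # extra cross-region redundancy assumed above
    print(quorum_replicas * geo_copies)   # -> 9x; with 5-way groups, 15x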


You're making some assumptions. For example, if you use EPaxos for the quorums there is no unavailability due to a leader failure and re-election. Even then, re-election is likely going to be a lot faster than any sort of fault-tolerant 2PC coordinator recovery protocol. You're exaggerating the write amplification. The above offering is single-region. Google says they do 5-way geo-replication in their various papers, and there's no way they're paying some cumulative 15-copy write amplification as you imply.


I've worked on a storage system that used 9x. 15x doesn't seem impossible to me.


EPaxos is not general-purpose replication. It leaves the problem of reconciling multiple replication streams to another layer. I've heard that they do 20x+ WA (distributed, before you consider on-disk WA) for much of it. *shrug*


9x write amplification isn't that high. If you are set up with a master-slave configuration, that creates write amplification, and for reliability you probably have some sort of RAID, which increases the number of places writes go as well.


Not necessarily RAID; it might be erasure coding.


The team here at Quizlet did a lot of performance testing on Spanner with one of our MySQL workloads to see if it's an option for us. Here are the test results: https://quizlet.com/blog/quizlet-cloud-spanner


What's the SQL and wire compatibility level? MySQL?

EDIT: Found quite a bit of my answers in your linked article:

> Cloud Spanner uses a SQL dialect which matches the ANSI SQL:2011 standard with some extensions for Spanner-specific features. This is a SQL standard simpler than that used in non-distributed databases such as vanilla MySQL, but still supports the relational model (e.g. JOINs). It includes data-definition language statements like CREATE TABLE. Spanner supports 7 data types: bool, int64, float64, string, bytes, date, timestamp[20].

> Cloud Spanner doesn't, however, support data manipulation language (DML) statements. DML includes SQL queries like INSERT and UPDATE. Instead, Spanner's interface definition includes RPCs for mutating rows given their primary key[21]. This is a bit annoying. You would expect a fully-featured SQL database to include DML statements. Even if you don't use DML in your application you'll almost certainly want them for one-off queries you run in a query console.

> Though Cloud Spanner supports a smaller set of SQL than many other relational databases, its dialect is well-documented and fits our use case well. Our requirements for a MySQL replacement are that it supports secondary indices and common SQL aggregations, such as the GROUP BY clause. We've eliminated most of the joins we do, so we haven't tested Cloud Spanner's join performance.

This seems like it'd prevent any kind of easy switch over to Spanner.
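For a concrete picture of what "RPCs for mutating rows" means in practice, this is roughly what a write looks like through the google-cloud-spanner Python client, if I'm reading the client docs right (instance, database, and table names are made up):

    from google.cloud import spanner

    client = spanner.Client()
    database = client.instance("my-instance").database("my-db")

    # No INSERT/UPDATE statements: you hand the API mutations
    # addressed by the table's primary key instead.
    with database.batch() as batch:
        batch.insert(
            table="users",
            columns=("user_id", "name"),
            values=[(1, "alice")],
        )
        batch.update(
            table="users",
            columns=("user_id", "name"),
            values=[(1, "alice_v2")],
        )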


Just to be clear, the JOINs were removed for the vertical sharding prior to looking at Cloud Spanner. Cloud Spanner fully supports complex JOINs of many types (e.g. INNER, OUTER).

Details - https://cloud.google.com/spanner/docs/query-syntax#join-type...

Disclaimer: I work on Cloud Spanner


How did you go from 10s of ms for an update for 2012 Spanner, to sub-ms update (best case) for Cloud Spanner according to Quizlet? Did TrueTime get an order of magnitude better? Or is Quizlet measuring the wrong thing?


Sorry for the confusion but I meant the DML portion.


It sounded like you can only modify by primary key? Can you make a transaction that contains a query and a bunch of updates by PK?

And yeah, it makes it sound like writing an ORM adapter will be much more difficult.


From reading the docs, the answer seems to clearly be yes, but I'm open to being corrected.


It has the same basis as standard SQL in BigQuery: https://cloud.google.com/bigquery/docs/reference/standard-sq...


Seems more like they want all the data that ever existed for the database.


I'm reading the test results and had a question.

>> Cloud Spanner doesn't, however, support data manipulation language (DML) statements. DML includes SQL queries like INSERT and UPDATE. Instead, Spanner's interface definition includes RPCs for mutating rows given their primary key[21].

Does this mean I need to rewrite my application?

My application uses an ORM and it typically converts my logic to SQL statements and fires them off to Postgres. Would I need to change it such that it doesn't issue INSERT / UPDATE statements?


No, though it does imply you would need to write adapter code to have your ORM call the equivalent mutation APIs for inserts or updates. Most ORMs can do this to varying degrees.


yes.


> We've eliminated most of the joins we do, so we haven't tested Cloud Spanner's join performance.

The join performance is by far the most interesting part of this to me. A more traditional NoSQL solution sounds like it would have worked just as well for you, sans all the atomic clock fanciness. Joining across geographically disparate data is a real trick, and it seems like there would be some physical performance limits?


> Not every application can handle Spanner's ~5ms minimum query time, but if you can, then you can have that latency for a very high-throughput workload


This is the tradeoff we've all been looking for. Cool product, anyway!


Just wanted to say thanks for this writeup. This is really excellent, to the point I was passing around your blog post in lieu of the GCP announcement.


I'm curious about how you have MySQL configured. If the query cache is enabled, you will see MySQL plateau. There's also the all-important innodb_buffer_pool_size and innodb_flush_log_at_trx_commit. See this for more info https://www.percona.com/blog/2013/09/20/innodb-performance-o...


We typically do yearly performance audits with Percona to ensure our databases are optimally configured. We disable the query cache. We set innodb_buffer_pool_size based on a % of total memory (as MySQL will use more than that in total, much of it for allocating connection structs and things like that). We set innodb_flush_log_at_trx_commit to 0, which is not ideal for data integrity but gives us more performance. In practice, because of Google's Live Migration technology we have never experienced a crash due to hardware on our master nodes (LM will run an emergency migration for any server whose hardware is detected to be going bad before it becomes a problem), and disks are abstracted away even further. Our main risks are the kernel crashing or MySQL crashing, which we've been fortunate enough not to have happen on our masters. Spanner provides ACID compliance by default with all the scalability and performance we get out of it.


> So a query that accesses 10 rows in disparate parts of the primary key space will take longer than one where the keys reside on the same splits. This is expected with a distributed system.

No, why? The query can be executed in parallel.

BTW, isn't 20k/sec very low performance for a 30-node installation? Cassandra can handle 50k+ (both writes and reads) on a single node. When most queries collect data from many nodes, it will scale almost linearly.


I don't think that comparison holds. It's easy to push 50k+ on a single node; you're basically only resource-bound on that machine. Pushing 20k+ on something that's globally consistent and spread out over so many instances is a different exercise entirely. It also depends on the level of consistency you're asking from Cassandra. You'd probably need to set this to EACH_QUORUM or ALL to mimic the behaviour Spanner gives you.

And yes Cassandra will scale linearly-ish as long as you're in the same datacenter. Try running a geo-distributed 30-node Cassandra ring and it's a whole different story at that level of consistency and availability.


Sure, but you can set up geo-replication and even implement a proxy to provide consistency across DCs. Sure, it won't be that fast in terms of latency, but throughput will be the same.


How does this proxy that provides cross-DC consistency work? And how does it guarantee that you'll get globally consistent reads?

In most cases I've seen, latency can affect throughput plenty, so I doubt your assertion that it won't affect throughput quite a bit. Even more so for anything that relies on TCP/IP.

I'd highly recommend reading the Bigtable[0] and Spanner[1] papers first and maybe then we can have a sensible and fruitful argument.

[0]: https://research.google.com/archive/bigtable.html

[1]: https://research.google.com/archive/spanner.html


Quickly skimmed this... how is a VM running MySQL comparable to Cloud Spanner?


it isn't


This release shows the different philosophies of Google vs Amazon in an interesting way.

Google prefers building advanced systems that let you do things "the old way" while making them horizontally scalable.

Amazon prefers to acknowledge that network partitions exist and try to get you to do things "the new way" that deals with that failure case in the software instead of trying to hide it.

I'm not saying either system is better than the other, but doing it Google's way is certainly easier for enterprises that want to make the move, which is why Amazon is starting to break with tradition and release products that let you do things "the old way" while hiding the details in an abstraction.

I've always said that Google is technically better than AWS, but no one will ever know because they don't have a strong sales team to go and show people.

This release only solidifies that point.


This isn't entirely accurate. BigTable was Google's earlier cloud database, and it's certainly non-traditional: you have to build your application without traditional consistency guarantees, the way you describe.

Spanner doesn't exactly hide the details, but it lets you make transactions that span multiple shards. You still eat the cost of the transaction, you're just free from having to implement it at the application level, which is a more difficult and error-prone way of doing things. The bottom line is that if you need consistency, it needs to be implemented somewhere in your stack. If you don't need consistency (analytics workloads come to mind) then you have more flexibility with your database.

Disclosure: Google employee, reconstructing what I know from published information.


Same for GAE, or now GKE (basically microservices before microservices were a thing) vs. EC2. GAE was pushing a fairly non-traditional architectural model while others were just trying to provide cloud-based VMs.

Disclosure: Also a Google employee, also reconstructing.


Amazon: Create usual services and sell them.

Google: Make unique products that push the boundaries of what was previously thought possible.

Amazon: Don't care about inefficiencies and usage. Inefficiencies can be handled by charging more to the clients, usage doesn't matter because the users are mostly the clients and they don't feel their pain.

Google: Had to make all their core technologies efficient, performant, scalable and maintainable or they couldn't sustain their business.


Not fair.

Amazon: IaaS

Google: PaaS

Amazon's philosophy is being 'close to the metal' to allow Enterprise customers to migrate 'regular apps' into a 'regular environment' in the cloud.

Most of Google's offerings are (at least were) novel, but proprietary ways of doing specific things.

Amazon is not a laggard: they have provided a number of interesting and useful 'helper' things to facilitate IaaS - as well as a number of 'pure cloud' type things.

Amazon is very, very customer focused. Their products come from customer demands.

Google often takes 'cool things they've done internally' and exposes them, hoping that they might have some use case in the rest of the world.

Google and Amazon are equally interested in profit.


Google is IaaS, their PaaS offering (App Engine) never gained much traction AFAIK. I also find the comparison fair, Google is a software engineering company, Amazon is a sales/marketing company.


At what point do infrastructure services become the platform? IMO, between GKE, Spanner, BigQuery, etc., it's basically a PaaS for non-trivial applications.



> Google: Had to make all their core technologies efficient, performant, scalable and maintainable or they couldn't sustain their business.

Which Amazon totally didn't have to do with their firehose of cash?


Google has to support Google, YouTube and many of the most resource-intensive services in existence on Earth. They needed to be "efficient enough" to operate that, meaning incredibly efficient.

Amazon runs nothing, it's an outsourcing firm. They needed to make services "good enough" to be sold. If a service is somewhat inefficient, it just charges the clients more to cover the costs.

Technologies reflect the business they were created in.


> Amazon runs nothing, it's an outsourcing firm.

What the fuck are you talking about. It's one thing to say AWS services are "good enough" but "Amazon runs nothing" is a ridiculous statement.


Agreed, and I work on Google Cloud. We may have different styles and core businesses, but I wouldn't say "eBay ran nothing" either. Logistics alone is a super fascinating space!


Interesting. But do Amazon actually do much logistics? I thought they were much more warehousing, and outsourced delivery to DHL and UPS.


You do realize the way most services get into AWS is that they're first built on the retail side of Amazon (without any thought towards AWS), and then once people realize it's effectively solving an actual problem, it's rebuilt for AWS. Having to support Amazon retail is a pretty demanding stress test -- I'm not sure why you're getting this notion that Amazon doesn't run anything. I should think handling Black Friday alone would count for something...


That is something of a myth. AWS was created and evolves completely separately from retail, which didn't really use it in anger until 2010ish. Retail is effectively a large customer of AWS. They're very good at watching what customers are doing in general.


The 'rebuilt for AWS' phrase is key.


No one is saying Amazon doesn't test their stuff. The argument here is that Google is inherently a more technical company, which is a fair comparison. Their products are more technical. AdSense, Gmail, and YouTube are incredibly technical products due to their scale, and the argument here is that nothing of similar technical depth exists in Amazon's core business, which I think is totally fair.


> The argument here is that Google is inherently a more technical company, which is a fair comparison.

I suspect that Google knows this, and their reputation for having poor customer support and sales comes from that knowledge.


Yeah that's all pretty accurate too. :)


I would phrase it as:

AWS prioritizes building blocks that support very high throughput and avoid leaky abstractions at all costs, and they're happy to push forward as long as these criteria are met. IMO they really succeed at this goal. Minus specific bugs that they're generally good about acknowledging, their services reliably do what they say they're going to. And they definitely solve a lot of problems for you, even if sometimes you're still required to get further into the weeds than you might want.

I'll buy that Google Cloud is better at questioning underlying assumptions, and sometimes succeeds in releasing higher-level abstractions than AWS without any leakiness (a great example of this now being Spanner vs Aurora). It also feels to me that with releases like this, Google is leveraging the full value of their own experience running their services, and seems to be more advanced than Amazon in some areas, so this has a lot of value; whereas AWS seems to build a broader range of products with a specific customer in mind which is not necessarily themselves (e.g. all of their move-your-on-prem-stuff-to-the-cloud helpers).

If you consider Spanner vs Dynamo, it definitely matches up as Google wrapping the old way and Amazon forcing a new way (though to be fair, Dynamo was released 5 years earlier). But on the other hand, considering Spanner vs Aurora, Amazon is the one embracing the old way with full MySQL & Postgres compatibility, whereas Spanner sounds like a pretty dramatically different subset of SQL in not supporting insert and update statements. It's a very reasonable compromise for basically getting to ignore the CAP theorem, but it is a significant difference that every developer will have to learn.


I would probably compare Dynamo to Google Datastore (released.. 8 years earlier?), or even to Bigtable, which went GA last summer. I'm not sure Spanner matches up with anything AWS at this point.

(work on google Cloud)


I think Aurora is the closest.


The "old way" was sacrificing functionality such as transactions and joins to get scalability (BigTable, DynamoDB).

Google tried that a decade ago and found it lacking, this is why Spanner exists in the first place.


Well, I'd say the "old way" is SQL with joins and schemas and transactions, and the "new way" is KV with eventual consistency.


Chronologically, we have: SQL -> NoSQL -> NewSQL

You're both right.


I'm not sure that what you said applies; they have severe restrictions, and Spanner offers a subset of MySQL functionality, which is already bare compared to other databases. Changes can be done by primary key only, so it almost feels like a KV store that can do joins...

I don't think it's easy to port existing applications to use it, and in the end you will still need to accommodate its shortcomings in your application.


That's a fair assessment. But I'm assuming they will make it do more "SQL" things in the future. I could be wrong though.

Either way, they are trying to abstract away having to think about eventual consistency with this offering.


Spanner doesn't use SQL for writes, it seems, so it's still a significant rewrite for legacy applications (especially ones that don't use an ORM).

The thing that's really different here is Google is basically saying: here's this awesome system; yes, it has obvious risks from partitioning, and we are going to stake our reputation on those partitions not happening.

In contrast, AWS is saying: this is DynamoDB; it's really limiting, but because of those limitations it should be pretty reliable as long as you write your application correctly.

It will be interesting to see if Microsoft and Amazon have to follow Google's lead here.


"I'm not saying either system is better than the other, but doing it Google's way is certainly easier for Enterprises that want to make the move, and why Amazon is starting to break with tradition and release products that let you do things "the old way" while hiding the details in an abstraction."

No, they said this in the F1 RDBMS or Spanner papers. They originally did the NoSQL, eventual-consistency type of stuff. This required app developers to do a lot of work to avoid the problems that model can create. Apparently, even their bright people had enough problems with it that they decided to eliminate or simplify that situation with stronger consistency. It took some brilliant engineering, but now they have a database as easy to use as the old model with the advantages of newer ones.

If anything, they learned some hard lessons and found a good solution to them. Now, they're offering it to others. I was hoping they'd do this instead of keeping it internal only. F1 and Spanner are amazing tech that could benefit many companies.


People used to make similar comparisons between the Russian and American space programs.


Oh yeah? Which was which in that comparison? I'm not familiar with that.


Presumably a reference to the classic "Pencil myth": Americans when faced with a need to write things down spent millions of dollars inventing a low gravity ballpoint pen and Russians just used a pencil.

As cutesy of a sentiment as it is, it's also full of misconceptions. The pens were invented by an American corporation that wanted better pens to sell in general (a smoother flow in a pen, regardless of gravity/orientation, makes a better pen), and they saw a good opportunity to market the pen to NASA for use in space. Both NASA and the Russians used pencils in space, but the problem with pencils is that the flakes can pollute an environment pretty quickly in low gravity, and the pens turned out to be a much better solution. (So far as I've heard, every space agency these days buys similar pens.)


You presume wrong. Pens are not a space program.

I meant the differences in design philosophy that permeate aerospace engineering on both sides. Russian, built ugly but for strength and longevity. American, built for high capability with finesse and finer tolerances. The emergent properties of these different principles explain why Soyuz is still a preferred launch vehicle, but it was the Americans who got to the moon and operated the STS.

Amazon is more like the Russians: built in the knowledge that things fail, but less magical as a result. Google is more like the Americans: remarkable technology, you just need a herd of geniuses to run it.


Sorry if you think I made the wrong presumption, but I've seen the "pencil myth" trotted out many times as a "definitive" example of precisely the dichotomy that you have spelled out. (A low gravity ballpoint pen is high capability that requires finesse and fine tolerances; a pencil is ugly but strong and typically exhibits longevity... and so forth.)

It is an interesting analogy this dichotomy you see in the design philosophies (both between the space programs and the mega-corporations), but perhaps my point, if I were attempting a point, is to beware of false dichotomies.


You can see the same in comparing the military aircraft design. Both schools are ingenious, but it is applied in different ways. There are some excellent comparisons connecting principles to outcomes on Quora.


>and release products that let you do things "the old way" while hiding the details in an abstraction.

However, by 'abstracting' this away, you're not being forced to think about failure domains. If there is ever a massive country-wide connectivity break to the wider Internet (feasible for lots of people inside censored countries), you'll be pretty pissed when you can't use the DB services for your servers in the Google-local datacenter that you still have connectivity to because it can't get quorum.


Cloud Spanner is currently a regional service, not a global service. So you would only lose availability for failures within the region.


Exactly my point. I would say I personally prefer the Amazon way of forcing you to think about these things.


I'd encourage you to read the F1 paper and the Spanner paper if you haven't already. The big thing that stood out to me is that Google started that way (with BigTable) where you'd roll your own availability and transactions. It didn't go well.

That's the motivation behind both Spanner and F1: take the experience of how painful it is to do transactions on a Regional or Global level, and never make individual teams do it again.

I see it a bit like "Don't roll your own crypto". Clearly some people are exempted from it, but you better be able to tell me why you get an exception.

Disclosure: I work on Google Cloud and want you to pay for Spanner :).


Some interesting stuff in https://cloud.google.com/spanner/docs/whitepapers/SpannerAnd... about the social aspects of high availability.

1. Defining high availability in terms of how a system is used: "In turn, the real litmus test is whether or not users (that want their own service to be highly available) write the code to handle outage exceptions: if they haven’t written that code, then they are assuming high availability. Based on a large number of internal users of Spanner, we know that they assume Spanner is highly available."

2. Ensuring that people don't become too dependent on high availability: "Starting in 2009, due to “excess” availability, Chubby’s Site Reliability Engineers (SREs) started forcing periodic outages to ensure we continue to understand dependencies and the impact of Chubby failures."

I think 2 is really interesting. Netflix has Chaos Monkey to help address this (https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey). There's also a book called Foolproof (https://www.theguardian.com/books/2015/oct/12/foolproof-greg...) which talks about how perceived safety can lead to bigger disasters in lots of different areas: finance, driving, natural disasters, etc.


> perceived safety ... driving

I became a way better winter driver when I started intentionally fishtailing in snow and ice (in low risk situations).


There's also research showing that removing safety features, e.g. white lines between opposing lanes, increases safety.

http://www.bbc.com/news/uk-35480736


I experienced this in Drivers' Ed with special tires that were deflatable and inflatable under instructor control! Until I moved for grad school, I hadn't realized this wasn't more common.


I wonder how this will affect adoption of CockroachDB [1], which was inspired by Spanner and supposedly an open source equivalent. I'd imagine that Spanner is a rather compelling choice, since they don't have to host it themselves. As far as I know, CockroachDB currently does not support providing CockroachDB as a service (but it is on their roadmap) [2].

[1] https://www.cockroachlabs.com/docs/frequently-asked-question...

[2] https://www.cockroachlabs.com/docs/frequently-asked-question...


(Cockroach Labs CTO here)

Google launching Spanner is generally a positive thing for our industry and our product. It's more proof that what we're aiming for is possible and that there's demand for it. We expect that in five years, all tech companies will be deploying technology like ours.

One of the big differences is that Spanner only uses SQL for read-only operations, with a custom API for writes. We use standard SQL for both reads and writes, which means we also work with major ORMs like GORM, SQLAlchemy, and Hibernate (docs should be live today or tomorrow). Spanner's custom write API will make it difficult to work with existing frameworks, or to convert an existing application to Spanner.

Cloud Spanner only works on Google Cloud and is a black-box managed service. CockroachDB is open source and can be run on-prem or in any cloud on commodity hardware. (We don't offer CockroachDB as a service yet, but may in the future)

At this point, both products are still in beta and are still missing features like back-up and restore (according to the Quizlet blog post). We plan to launch CockroachDB 1.0 with back-up / restore enabled.

* For anyone wanting to know more about how we make CockroachDB work without TrueTime, see our blog post: https://www.cockroachlabs.com/blog/living-without-atomic-clo...


> Google launching Spanner is generally a positive thing for our industry and our product. It's more proof that what we're aiming for is possible and that there's demand for it. We expect that in five years, all tech companies will be deploying technology like ours.

Echoing this! It's a truly exciting moment for everyone in the field.


Exciting times on the horizon for Cloud technologies. Godspeed.


I for one would love to see a hosted offering of cockroachdb!


Would Cockroach 1.0 comply with SQL:2011?


(CockroachDB CTO here) We haven't implemented everything in the standard yet (Nor will we by 1.0 - there's a lot of stuff there!), but we are aiming to ultimately be compliant with the SQL standard. For example, when we introduced "time travel queries" (https://www.cockroachlabs.com/blog/time-travel-queries-selec...) we adopted the SQL-standard syntax "AS OF SYSTEM TIME" (as opposed to the non-standard out-of-band parameter used in Cloud Spanner)
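For example (a sketch only; the connection details are placeholders, and since CockroachDB speaks the Postgres wire protocol, any stock Postgres driver works):

    import psycopg2

    # CockroachDB's default SQL port is 26257.
    conn = psycopg2.connect("postgresql://root@localhost:26257/bank")
    with conn.cursor() as cur:
        # Read the row as it was at a fixed point in the past.
        cur.execute(
            "SELECT balance FROM accounts "
            "AS OF SYSTEM TIME '2017-02-14 00:00:00' WHERE id = %s",
            (1,),
        )
        print(cur.fetchone())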


The main sales pitch of Cloud Spanner is Google's network infrastructure.

No startup will be able to replicate that anytime soon, a lot of time (and money) has been put into it by a lot of people over a long time.


Curious: is there any company in the world that could replicate its breadth, performance, and reliability in the next decade?

Could any government? Has any government?

My impression is that, infrastructure wise, Google is genuinely in a class of size one.


It's probably a class of size 2, with Amazon. Beyond those two, though, no one else is close.


I honestly don't think Amazon is even close to Google.

How much more infrastructure do they have besides AWS? How much does Google have besides GCP?


That's ignoring just how much larger AWS is than GCP.


Sure, but those are the public offerings - the largest part of Google's infrastructure is not public.


Capital expenditures might shed some light on this. I don't think there's enough public data to be clear but in 2015 Amazon (4.8B), Google (9.9B) and Microsoft (5.9B) were at least on the same order of magnitude in terms of CapEx, whereas other major "datacenter" companies like Rackspace (475M) are much smaller.

I don't think you can draw any definitive conclusions from this, but calling it a class of size 1 or 2 is probably an overstatement of Google (+/- Amazon)'s advantage over Microsoft at least.


What? Close to what? There are many many companies with lots of network infrastructure. Google and Amazon are not by themselves.


Ever heard of Facebook? :)


> Could any government?

NSA's annual budget is $50bn. U.S. military budget is about $600bn.

Google's revenue is $90bn and they don't spend all of it.


AWS, they just don't talk publicly about it so much.


https://www.cockroachlabs.com/blog/living-without-atomic-clo...

> A simple statement of the contrast between Spanner and CockroachDB would be: Spanner always waits on writes for a short interval, whereas CockroachDB sometimes waits on reads for a longer interval. How long is that interval? Well it depends on how clocks on CockroachDB nodes are being synchronized. Using NTP, it’s likely to be up to 250ms. Not great, but the kind of transaction that would restart for the full interval would have to read constantly updated values across many nodes. In practice, these kinds of use cases exist but are the exception.

CockroachDB is waiting for time keeping hardware to improve.


Eric Brewer's post on Cloud Spanner mentioned that Google intends to expose TrueTime to customers at some point. If/when that happens, it would be very interesting to see CockroachDB's performance on Google Cloud. (They might have to do some engineering work to accommodate whatever TrueTime API is exposed, but when timekeeping is fundamental to your product, that seems worthwhile.)


If the clock offset is too high (more than 250ms), we should use another transaction model; Google Percolator's is a good fit until the unforeseeable hardware improvement arrives. Based on monitoring of clock offsets in the cloud, TiDB chose to use the timestamp oracle to allocate timestamps, which is much faster.


Maybe you could help fund Eric S. Raymond to improve NTPD; he might have some good ideas about improving normal PC-class hardware cheaply too.

https://www.ntpsec.org/


CockroachDB can be hosted on any cloud for a fraction of the cost, I'd think that's a huge advantage for small/solo startups.


Given that Spanner starts at $650/mo/node + storage costs, I think Cockroach could still see huge usage as a self-hosted alternative.


That's not very much given the capabilities and managed service. Anything cheaper probably means the single-node managed SQL offerings are more than enough.


But single node with unlimited room to grow is always a good value prop. Cockroach can market that, but Spanner can't.


I imagine the globally distributed database market is big enough for more than one winner. The presence of competitors can sometimes even be a boon, increasing the visibility of a market's goods relative to other similar goods.


And assuming some reasonable compatibility/portability/migration story it can help in reducing the rational fear of proprietary lock-in.


I am also interested in how it compares to NewSQL databases like NuoDB. NuoDB has been positioning itself as a very similar type solution (no compromise relational distributed database) to Cloud Spanner for a while (minus the cloud hardware provided for you).


For those trying to compare this with AWS Aurora: Aurora is more of a regular database engine (MySQL / Postgres) with a custom data storage plugin that's AWS/EBS/SSD/EFS-aware. Because of this, the database engine can make AWS-specific decisions and optimizations that greatly boost performance. It supports master-master replication in the same region, master-slave across regions.

Global Spanner looks like a different beast, though. It looks like Google has configured a database for master-master(-master?) replication, across regions and even continents. They seem to be pulling it off by running only their own fiber, each master being a Paxos cluster itself, GPS, atomic clocks, and a lot of other whiz-bangery.


From the technical blog post

> Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA. The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system. However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.

Basically, the underlying system is CP, but A is so high (because of the custom fiber, Paxos, etc.) that they're rounding it off to 100% and calling it CAP.


Except that A in CAP has nothing to do with the overall system's availability over time, and using it as such is just confusing.


But it makes sense in this case. The system guarantees CP. But as a customer it looks like you're getting CA as well, because A is so high. If you drink the kool-aid, you get C & A & P.

The kool-aid isn't too bad, though: if they can measurably guarantee A > 99.999999%, I'm happy to round off to 100% and call it CAP.


The availability is "only" 99.999%, which IMO is still really high!

(I work for Google Cloud)


https://uptime.is/99.999 puts that at:

Daily: 0.9s

Weekly: 6.0s

Monthly: 26.3s

Yearly: 5m 15.6s


Where does it say that? Does Google Cloud Spanner guarantee three nines of uptime?

On the spanner page it says:

>> This feature is not covered by any SLA

;) ;)


Once Cloud Spanner leaves beta, it will be covered by an SLA. GCP Alpha/Beta products don't have SLAs for the most part.


So essentially marketing buzz. "It's always available because we spent a lot of money to make it available."

CAP is about how the system deals with partitions, not whether it has partitions or not.


The person who wrote this "marketing buzz" is Eric Brewer, who was the originator of the CAP theorem in the first place. I think he knows what it was about :)


See my other comment in this thread on how they still used misleading language in their marketing, regardless of who the author is.

Don't fall for these PR tricks. Hiring thought leaders/influencers is one that should be easy to spot. Best to judge claims on their own merit and not by the person who speaks them...


What am I supposed to think about someone who creates a throwaway account called "the_cap_theorem"? You look like a competitor that's freaking out and throwing FUD.

I don't think Eric's words were misleading. People will judge Cloud Spanner on its own merit, but given the reputation Eric has both academically and professionally, I think Cloud Spanner is deserving of attention, and is not a "PR trick".


I don't care who Eric is, it is still misleading. He knows he is providing misinformation and he tries to cover his a__ by adding that technically it is a CP system, when a partition happens. And that's what matters.

Saying it is CA because Google threw some money at reducing the chances of a partition happening is very misleading. By that definition you could take any RDBMS and claim that it is CA, because with enough money you can make sure hardware won't fail. It defeats the purpose of what CAP is used for, which is telling us how the system will behave when a partition does happen. It defeats the purpose of what Aphyr was doing with his Jepsen tests, which observe how a given storage solution behaves when there's a network partition.

The claim that Eric Brewer wrote regarding Spanner makes him lose credibility, to me at least. It almost feels like he was hired by Google just so he could claim that.


I think you've somewhere lost the plot of what Google has achieved here, and in no way does it detract from running Spanner against Jepsen.

Google is claiming they've built a geographically distributed CP system with linearizability and 99.999% global availability. They haven't defeated CAP, because A is about total availability in the face of network failures. But for most practical purposes they've achieved exactly what most people want in their database: consistent, and mostly available except perhaps 30 seconds a year. That is why it is completely acceptable for Eric to say this is "effectively" a CA system.

Eric was hired to help BUILD this, not market it.

Aphyr also has no issues with Eric's comments (read Twitter) - Spanner is a CP system with better latency due to the use of TrueTime.


> Aurora... supports master-master replication in the same region

I don't believe that's true, but I could be mistaken?

(Work at Google, not on Cloud Spanner)


Yeah, think you're right there. The exact text is this

> Each 10GB chunk of your database volume is replicated six ways, across three Availability Zones. Amazon Aurora storage is fault-tolerant, transparently handling the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Amazon Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and replaced automatically.

This blurs the lines a bit - the DB engine itself isn't doing the master-master replication, but the underlying storage engine is handling that. Your transactions are effectively running with synchronous replicas as if the system is master-master, but the engine itself is not. So you get the data protection guarantees from a master-master system, but not the availability.


Aurora only supports single master.


The white paper is available here: http://static.googleusercontent.com/media/research.google.co...

for anyone interested



I bet there will be a lot about the CAP theorem in the comments :-)


I wonder why they charge a minimum of $0.90 per node-hour when they offer VMs for as little as $0.008/hr. This is hugely useful even for single-person startups, so why charge a minimum of ~$8,000 per year?


Concur. The pricing makes sense for their "target market" of folks who currently have a "bursting at the seams" MySQL or PgSQL instance, but it locks out folks just getting started with a tiny database and low load. This seems like bad positioning: the "bursting" folks will have to decide between the cost of re-hosting their whole system on Cloud Spanner and trying to incrementally keep their current platform running; the small folks who would like to organically grow on a platform without scaling limits are locked out of the low end, so by the time they are big enough to need Cloud Spanner... they too will be forced into the "re-host or muddle on" decision.


This seems like a real problem. The "bursting at the seams" people will need a pretty significant rewrite to run on Spanner, which seriously limits the appeal.

The ideal market for Spanner seems to be new projects developed by big companies that know that big traffic will arrive on day 1. In other words, Google. Which I guess shouldn't be too surprising...


> The pricing makes sense for their "target market" of folks who currently have a "bursting at the seams" MySQL or PgSQL instance

My intuition, which I hope is wrong, suggests this is a small market.


To be fair, you shouldn't even run MySQL on an f1-micro. This is more on par with an 18-vCPU raw server, before you'd even consider any value that the software provides.

I'd certainly love to see us get to a world where we can split up a single Spanner "install" in an isolated, multitenant manner, but even for a small company, $8k/year is admittedly a small fraction of one engineer. At a company with several, you can share your single Spanner instance just like you would any other database.

Disclosure: I work on Google Cloud (but not Spanner).


> even for a small company, $8k/year is admittedly a small fraction of one engineer

This is certainly true, but overlooks the fact that many major products start out as experimental projects, and $8k/year is a significant investment for an experiment.

If there's a reasonable upgrade path from traditional databases like MySQL and Postgres this shouldn't be a big deal, but if the answer is "rewrite your app" it will probably be a friction point for adoption.


Hugely useful but also hugely different from an engineering/coverage perspective, perhaps.

Companies with more data than can fit in a single-instance RDBMS system (like >3TB of hot data, more throughput than a single node can handle) but still seeking transactional consistency are a clear use case. Single-person startups could definitely benefit, but it's a less-likely scenario that they would require the level of coverage Spanner provides.


But the most successful products tend to be ones that can scale from zero to global. DynamoDB is a great example.


Because you're paying not for cost plus, but value added. To make a meaningful cost comparison, the best alternative must be considered, which is you spending your own labor to engineer and build a similar system yourself.


A Spanner node is not related to a VM instance, they are entirely different concepts. A Spanner node refers to a collection of regional (and eventually global) compute and storage resources, while a VM instance is just a virtual machine running on a single physical host.


Noticed that too; this is not for a 'one guy mini-startup', but 'one guy' can already run small loads for almost free on a VPS.

Having your database problem solved, however, is one less thing to worry about if you're bigger.


Because if you aren't spending $8k/yr on your database, then you don't need this level of scale.


A cheap VM is not likely to give the same type of performance.


Why should useful things be cheap?


Amazon likes to respond to Google with its own price drops and product launches. It's telling that their announcements are orthogonal instead of in direct competition with Spanner.

When Google announced Spanner back in 2012, I'm sure Amazon and Microsoft started teams to reproduce their own versions.

Spanner is not just software. The private network reduces partitions. GPS and atomic clocks for every machine help synchronize time globally. There won't be a Hadoop equivalent for Spanner, unless it includes the hardware spec.


Amazon already has Aurora: https://aws.amazon.com/rds/aurora/details/

You're right that there's literally nothing else out there that has tight synchronization using atomic clocks, though.


Aurora is a toy compared to Spanner.

Single region, limited backups and replication topologies, limited performance.

It is yet to be seen whether Spanner can achieve the expectations; if it does, it's a game changer.


> achieve the expectations

It has for Google internally. No reason why they can't share the service externally.


Actually, there are lots of reasons why it's hard to share a service externally. Making a service public means you have to deal with lots of new problems like billing, abuse, dealing with lots of small users instead of a few large ones, stronger backwards compatibility requirements and so-forth.


Yes, but those are orthogonal to the scaling issues that I presume user5994461 was thinking of.


That's kind of the point, when you make a service like Cloud Spanner public you create new scaling dimensions. New scaling dimensions means new problems to solve that are not trivial.


And because of that, Aurora's multi-zone replicas are read-only.

I just noticed Google says the cross-region feature is coming later in 2017. Amazon might be planning to announce a similar change for Aurora in the coming months.


Thomas Watson in 1943 and his famous quote: "I think there is a world market for about five computers".

If he were alive, he could say these computers are Google, Apple, Microsoft, Amazon and Facebook.



You win :D


How does this compare to AWS Aurora in terms of pricing and performance?

With Aurora the basic instance is $48/month and they recommend at least two in separate zones for availability, so it's about $96/month minimum. Storage is $.10/GB and IO is $.20 per million requests. Data transfer starts at $.09/GB and the first GB is free.[1]

Spanner is a minimum of $650/mo (6X the Aurora minimum), storage is $.30/GB (3X), and data transfer starts at $.12/GB (1.3X).

Of course with Aurora you have to pick your instance size and bigger faster instances will cost more. Also there's the matter of multi-region replication, although it appears that aspect of Spanner is not priced out yet. So maybe as you scale the gap narrows, but it's interesting to price out the entry point for startups.

[1] https://aws.amazon.com/rds/aurora/
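The entry-point math, spelled out:

    hours_per_month = 24 * 30
    spanner_min = 0.90 * hours_per_month   # ~$648/month for one Spanner node
    aurora_min = 48 * 2                    # ~$96/month for two small Aurora instances
    print(spanner_min / aurora_min)        # ~6.75x the entry price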


Aurora replicas appear to be read-only, according to that link.


Sure, but to compare accurately we should look at where that impacts performance in practice. How many writes can Aurora's largest instance handle? What's the write latency from other parts of the globe?


Forgive my ignorance, but could someone explain in layman's terms in which situation this would be helpful? E.g. if I have 1TB of data, would I use this? If I have 1GB with a growth rate of 25GB daily, would I use this?


The rule of thumb I have always been told is that you can push MySQL to ~200 GB total and between 99.9% and 99.99% availability (between 1 and 9 hours of downtime per year). Your mileage may vary, but these are probably the right orders of magnitude. There is also an IOPS limit, but that's harder to put a clear number on because it's workload-dependent.

If you need more storage, availability, or IOPS, the current recommendations are sharded relational storage or NoSQL.

Sharding relational databases means instead of turning on 1 MySQL machine, you turn on 16 MySQL machines and store 1/16th of the data on each machine. It only allows query patterns that can be answered by a single shard (so if you are only doing queries on a per-user basis, you're golden; otherwise, not so much) and requires that your hottest shards (e.g. most active users) be only a few orders of magnitude larger than your coldest (Justin Bieber probably needs his own machines at Twitter because he has so many followers). There are ways to get around this, but they are tricky and will require a lot of developer time.

With NoSQL (Riak, Cassandra, Dynamo, etc.), you have to give up many features that make developers' lives easier in order to scale beyond a single machine. Each flavor has its own set of tradeoffs, but the biggest problem is that you lose (1) committing multiple rows at the same time (atomic commit) and (2) ensuring queries don't overlap (isolation).

Spanner offers an alternative where the developer (for the most part) doesn't have to know that her data lives on multiple different machines. It gives you 99.999% availability and theoretically unlimited scaling to any size, with an API that is still relatively developer-friendly (compared with NoSQL or sharded SQL). One cost here is latency / performance. You need to write to multiple computers across the world, so write speed will be bounded at a minimum by the speed of light between those datacenters.

Short answer: there are a lot of companies that could benefit from a system like this. It could give higher availability to a small SQL dataset or stronger guarantees to a currently large NoSQL database. There is even an argument to be made that the majority of problems companies face today are best solved with this type of system, and relational- or NoSQL-optimized workloads are the exception, not the rule.
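A minimal illustration of the 16-shard scheme described above (hypothetical hostnames; real setups usually use consistent hashing or range-based routing):

    NUM_SHARDS = 16
    shards = [f"mysql-{i:02d}.internal" for i in range(NUM_SHARDS)]

    def shard_for(user_id: int) -> str:
        # Per-user queries hit exactly one shard...
        return shards[user_id % NUM_SHARDS]

    print(shard_for(42))  # -> "mysql-10.internal"
    # ...but a query across all users (say, a global JOIN or aggregate)
    # must fan out to all 16 shards, which is why only single-shard
    # query patterns stay cheap.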


1 to 9 hours of downtime a year for MySQL seems on the high side. With Aurora, the failover times are much less, and outside of the cloud, if there's a NOC, they can fail over quicker than that.


You may benefit from something like this if you do an analysis of your DB usage/growth patterns and determine that server/cluster scalability is going to be a non-trivial concern in the near future. Volume of data doesn't matter (up to a point) if you seldom query that data. 1000TB of data that is neither queried nor updated doesn't require a database at all, for example.

So, as with all things engineering, it depends. You know you'll benefit from it if you can demonstrate that you'll benefit from it and that the effort/risk/etc. is lower than some other solution you could choose. If you're unable to determine this, you should bring in an expert who can.


> Today, we’re excited to announce the public beta for Cloud Spanner, a globally distributed relational database service that lets customers have their cake and eat it too: ACID transactions and SQL semantics, without giving up horizontal scaling and high availability.

This sounds too good to be true. But it's Google, so maybe not. Time to start reading whitepapers...



Link to the actual OSDI paper (not the simpler whitepaper) https://static.googleusercontent.com/media/research.google.c...


Looks cool, but the pricing seems a bit non-cloud-native (or at least non-GCP-native).

"You are charged each hour for the maximum number of nodes that exist during that hour."

We've been educated by Google to consider per-minute, per-instance/node billing normal - and presumably all the arguments about why this is the right, pro-customer way to price GCE apply equally to Cloud Spanner.


The per-minute billing is an advantage when you are scaling up and down rapidly. If you use VMs for just 5 minutes, per-minute pricing is 12x cheaper than hourly billing.

However, with a database it is rare to scale up and down rapidly; rather, you expect change on the order of days. Imagine you go from 10 instances to 15 instances over a week. Per-minute billing only saves a possible 5 instance-hours over the week compared to hourly billing, which is less than a 1% saving.
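Spelling that out (hypothetical ramp from 10 to 15 nodes over a week):

    hours_per_week = 24 * 7                     # 168
    avg_node_hours = 12.5 * hours_per_week      # ~2100 node-hours billed either way
    worst_case_savings = 5                      # < 1 wasted partial hour per node added
    print(worst_case_savings / avg_node_hours)  # ~0.0024, i.e. well under 1%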


While everyone is puzzling over how Spanner seems to be claiming to be CA, I would like to take this opportunity to bring up PACELC[1].

The idea is that the A-or-C choice in CAP only applies during network partitions, so it's not sufficient to describe a distributed system as either CP or AP. When the network is fine, the choice is between low latency and consistency.

In the case of Spanner, it chooses consistency over availability during network partitions, and consistency over low latency in the absence of partitions.

1: http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf


It's not claiming CA. It's claiming CP with A strongly implied thanks to a massive amount of Google engineering and private networking.

https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...


I think A is wrongly implied, tried to explain it here: https://news.ycombinator.com/item?id=13645925


I said "seems to", since a lot of people seem to think it :P


> clients can do globally consistent reads across the entire database without locking

How is this possible across data centres? Does it send data everywhere at once?

Seems too good to be true, of course, but if it works and scales, it might be worthwhile just to not have to worry about your database scaling. Still, I don't believe it ;-)

EDIT: further info...

> Spanner mitigates this by having each member be a Paxos group, thus ensuring each 2PC “member” is highly available even if some of its Paxos participants are down. Data is divided into groups that form the basic unit of placement and replication.

So it's SQL with Paxos that presumably never gets confused, but during a partition will presumably not be consistent.


> In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area, which many find surprising or even unlikely. The claim thus merits some discussion. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA. The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.


I would expect more from Brewer.

"CA except when there are partitions" is CP. It's not "effectively CA".


No, he's saying it's effectively CA because the A downtime is so small.

It's one thing to do that for a key-value store. Entirely another to support joins on a globally distributed database. This ain't just one availability zone. Spanner is amazing.

It took them a few years to make it a service, but when they announced its use internally a few years ago, it seemed like the nail in the coffin for in-house database hosting.


I understand what he's saying. It's marketing.

There's nothing wrong with saying it's CP, but since we control everything, partitions are extremely rare. Then he can show availability numbers (which he kind of does).

Saying it's "effectively CA" defeats the point of the CAP theorem, which says you have to make tradeoffs. See: https://codahale.com/you-cant-sacrifice-partition-tolerance/


> It's marketing.

No, it's engineering. It's the recognition that if periods of unavailability are too small and too rare to be noticed, then the system behavior is indistinguishable from an "available" system in the sense of the CAP theorem.

It's like the "Retina" display you're probably reading from. There are pixels, you just can't see them.


Another point is that since all records are globally timestamped, you can do a read that is consistent at a timestamp in the past (i.e. read data as the database was 1 second ago, or something like that).

If data from other places has synchronized to your zone, you may be able to do this globally-consistent read while only touching your local datacenter (because TrueTime guarantees that no other records anywhere in the system will be created at the time you are querying).
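
Roughly how that could look with the Python client - a sketch only; the instance, database, table, and column names are placeholders, and the beta client's API may differ:

  import datetime
  from google.cloud import spanner

  client = spanner.Client()
  database = client.instance("my-instance").database("my-db")  # placeholders

  # Read the database as it was one second ago. Because every commit
  # carries a TrueTime timestamp, this snapshot is globally consistent
  # and takes no locks.
  staleness = datetime.timedelta(seconds=1)
  with database.snapshot(exact_staleness=staleness) as snap:
      for row in snap.execute_sql("SELECT id, name FROM users"):
          print(row)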

Note: I work at Google, but I don't know more about Spanner than the Spanner paper.


Check out the papers. They revealed Spanner a few years ago. Other commenters have provided links.


I was like YES!!! Then I read that for a single node it is 90 cents per hour, and then I was like NO!!! So the absolute minimal cost for me is $648/month? I was hoping there was a dev version. Maybe I didn't read the fine print?


I was thinking the same thing! Maybe it's better to look into some smaller service which offers e.g. PostgreSQL as a Service, like https://www.elephantsql.com/plans.html or https://www.databaselabs.io/pricing/ (I'm sure there are even more; I just did some quick googling). Does somebody have experience with such services?


Postgres at least offers full-text indexing. I couldn't find that on Cloud Spanner.


One thing to note is Spanner's transactions are different compared to what you get with a traditional RDBMS. See https://cloud.google.com/spanner/docs/transactions#ro_transa...

An example is that the rows you get back from a query like "select * from T where x=a" can't be part of a RW transaction, I believe because they don't have the timestamp associated with them. So you have to re-read those rows via primary key inside a RW transaction to update them. This can be a surprise if you are coming from a traditional RDBMS background. If you are thinking about porting your app from MySQL/PostgreSQL to Spanner, it will be more than just updating query syntax.
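
A sketch of that pattern with the Python client (the table, column, and instance names are made up - treat it as pseudocode for the re-read-by-key dance, not a verified recipe):

  from google.cloud import spanner

  client = spanner.Client()
  database = client.instance("my-instance").database("my-db")  # placeholders

  # 1) Find the candidate primary keys with an ordinary query.
  with database.snapshot() as snap:
      keys = [[row[0]] for row in snap.execute_sql(
          "SELECT id FROM T WHERE x = @a",
          params={"a": 42},
          param_types={"a": spanner.param_types.INT64})]

  # 2) Re-read those rows by primary key inside the read-write
  #    transaction, then update them.
  def bump(txn):
      for row in txn.read("T", ("id", "x"), spanner.KeySet(keys=keys)):
          txn.update("T", ("id", "x"), [[row[0], row[1] + 1]])

  database.run_in_transaction(bump)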

Disclaimer: I used F1 (built on top of Spanner, https://research.google.com/pubs/pub41344.html) a few years ago.


>> Remarkably, Cloud Spanner achieves this combination of features without violating the CAP Theorem.

This is the best weasel PR language I have seen in a long time.

Note that the sentence does not actually proclaim that they solved (the previously "unsolvable") problem of achieving distributed consensus with unreliable communication while maintaining partition tolerance and availability.

The blog only says they don't "violate" the CAP theorem -- whatever that means. So the statement is technically correct. Still the intention is obviously to mislead the casual reader (why else would you start the sentence with "Remarkably"?).

A litmus test: The same statement is true for MySQL - or _any other_ database in fact:

  >> "Remarkably, MySQL achieves this combination of features without violating the CAP theorem"
It's a bit like saying

  >> "Remarkably, MySQL is not a perpetuum mobile"


Given that CockroachDB is based on Spanner and F1, this DBaaS sounds like it will compete directly with them.


> Given that CockroachDB is based on Spanner and F1, this DBaaS sounds like it will compete directly with them.

Except that it's not. CockroachDB uses MVCC. Spanner uses 2PL and F1 uses OCC. Not the same at all.


Is JSON data type support in the works? Seems to be a very commonly requested feature these days.


Yes, it's something we have on the roadmap. We will adjust priority based on the level of demand so thanks for your vote ;-)

(disclaimer: I work on Cloud Spanner)


Dumb question: Since strings are supported, couldn't you just store some JSON as a string? Or are you talking about supporting queries that involve parsing the JSON on the "server"?


Yeah, parent probably meant queryable JSON types, a la Postgres[1].

1: https://www.postgresql.org/docs/9.4/static/datatype-json.htm...


Not dumb, I was being inexact. I was talking about Spanner understanding the semantics of JSON, similar to Postgres's JSONB or MySQL's JSON.


I think the OP was saying "dumb question" as in "I have a dumb question" rather than "this is a dumb question". At least that's the context I inferred.


I can vouch for Spanner: it's a badass piece of Google's infrastructure.


Related, I wrote a blog post on the network latency between Google Compute Engine zones and regions. I'm assuming Cloud Spanner will still have these latencies once multi-region is deployed. Cross-zone latency on GCE is very good though.

https://blog.elasticbyte.net/comparing-bandwidth-prices-and-...


Oh this looks really compelling! Though I'm guessing this is targeted to companies? I'd love to use this for some personal projects but the pricing seems really high. Am I reading it right that a single node being used at least a tiny bit every hour is about $670 a month?

Maybe I'm misunderstanding how the pricing works here. Any clarification would be highly welcomed :)


It's targeted to large datasets more than companies. There isn't really any advantage of a single node Cloud Spanner instance over Cloud SQL. Cloud Spanner becomes worthwhile when you have more data/throughput than a single node system can support, at which point the pricing is competitive with other options.


Fair enough. I just kinda wanted to play with it on my next project :)


What is TrueTime, really? Are their distributed systems 'sharing a global clock'?


From the Spanner paper:

> The underlying time references used by TrueTime are GPS and atomic clocks. TrueTime uses two forms of time reference because they have different failure modes... TrueTime is implemented by a set of time master machines per datacenter and a timeslave daemon per machine. The majority of masters have GPS receivers with dedicated antennas; these masters are separated physically to reduce the effects of [GPS] antenna failures, radio interference, and spoofing. The remaining masters (which we refer to as Armageddon masters) are equipped with atomic clocks. An atomic clock is not that expensive: the cost of an Armageddon master is of the same order as that of a GPS master.

Source: https://static.googleusercontent.com/media/research.google.c...



Here's a very rough (and non-practical due to the large time error) implementation of TrueTime that I wrote a while back that uses the regular linux apis:

https://stackoverflow.com/questions/18384883/why-is-googles-...

I suppose if you spent some serious effort with ntpd (maybe hook it up to a PCI ClockCard), you could get this approach to work.
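
The essence fits in a few lines of Python. This sketch hard-codes an assumed error bound where a real implementation would pull an uncertainty estimate from ntpd or GPS/atomic-clock hardware:

  import time

  EPSILON = 0.005  # assumed worst-case clock error (+/- 5 ms)

  def tt_now():
      """TrueTime-style answer: an interval bounding the true time."""
      now = time.time()
      return now - EPSILON, now + EPSILON

  def commit_wait(commit_ts):
      # Spanner's trick: hold a transaction's locks until the commit
      # timestamp is guaranteed to be in the past on every clock, so
      # timestamp order matches real-time order.
      earliest, _ = tt_now()
      while earliest <= commit_ts:
          time.sleep(EPSILON / 2)
          earliest, _ = tt_now()

  _, latest = tt_now()
  commit_wait(latest)  # returns after roughly 2 * EPSILON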


Does it support spatial objects / can it replace PostGIS?



> This leads to three kinds of systems: CA, CP and AP,

What is a distributed system that is CA? Can you build a distributed system which will never have a partition?


They answer your question just two lines under the one you are quoting:

> For distributed systems over a “wide area,” it's generally viewed that partitions are inevitable, although not necessarily common. If you believe that partitions are inevitable, any distributed system must be prepared to forfeit either consistency (AP) or availability (CP), which is not a choice anyone wants to make. In fact, the original point of the CAP theorem was to get designers to take this tradeoff seriously. But there are two important caveats: First, you only need to forfeit consistency or availability during an actual partition, and even then there are many mitigations. Second, the actual theorem is about 100% availability; a more interesting discussion is about the tradeoffs involved to achieve realistic high availability.


> If you believe that partitions are inevitable, any distributed system

How does that answer it? Are they implying that partitions will not happen if you don't believe in them?


A CA system is not a system that doesn't have partitions, it's a system that works under the condition that there are no partitions (ie. it is a non-partition-resistant system).


> , it's a system that works under the condition that there are no partitions

But that is not a choice with distributed systems, unless as soon as a partition happens the system shuts down immediately. That is it effectively disappears and never re-appears again. But that's not a CA system then?

Or saying that it is not partition resistant is also difficult, because the system in case of a partition will do _something_. The typical choice is that it either responds to clients (trying to be AP) or it doesn't (trying to be CP).

That is why I understand CA systems as equivalent to the belief that "partitions can't happen". Which I think is unrealistic.


This is a system which is guaranteed to be consistent and available as long as there are no partitions. If there are partitions, the system isn't guaranteed to be either.

Yes, it's a pretty much useless system.


A memcache cluster might qualify. If there's a partition, just forget about the missing nodes. It's just a cache anyway.


The system you describe would typically be referred to as AP, because it maintains availability in the presence of partitions.


Trivially, by never allowing either half of a partition to make progress while the partition is in place. Since the CAP theorem, by itself, doesn't put a cap (oof) on latency, it is valid to consider a system CA if it is always available to listen to requests while partitioned, but never able to fulfill them.

This, of course, is effectively useless in practice, and is dependent on an infinite buffer of pending operations, etc.

https://brooker.co.za/blog/2014/07/16/pacelc.html


> Trivially, by never allowing either half of a partition to make progress while the partition is in place.

Doesn't availability mean getting a response, whether success or failure? If during a partition there is no response of success or failure, how is the system available? It seems rewriting a claim like "x will happen" to "x will happen after an infinite timeout" should not be valid.


It does, but within what bounds? The CAP theorem doesn't specify. One could assume that it means before the partition is restored, but that is only one possible valid interpretation. The PACELC theorem, which is by no means the last word on the story, clarifies this well:

https://en.wikipedia.org/wiki/PACELC_theorem

"PACELC builds on the CAP theorem. Both theorems describe how distributed databases have limitations and tradeoffs regarding consistency, availability, and partition tolerance. PACELC however goes further and states that a trade-off also exists, this time between latency and consistency, even in absence of partitions, thus providing a more complete portrayal of the potential consistency tradeoffs for distributed systems."

And I would take that argument one step further and say that latency and partitioning are effectively identical: from the point of view of any given operation, it is impossible to say whether the system is in a partitioned state until the max latency (timeout) has elapsed, because failure to make progress within the timeout is the only meaningful definition of partition-induced unavailability.


Aphyr covers the impossibility of practical CA systems here: https://aphyr.com/posts/325-comments-on-you-do-it-too

Sometimes it's helpful to consider a distributed system through the lens of "harvest vs yield" where harvest is the proportion of information in a system reflected by a response, and yield which is the probability of receiving a response at all.

https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24...


That's a good summary by Aphyr; he said it better than me, of course.

Now I know it is Eric Brewer himself who wrote that announcement, so I make no claim of disproving him or knowing better than him; I am just struggling to understand the implications. That CA term has caused a lot of confusion (at least for me), and so far, from what I have read, talking about CA systems doesn't make sense.

Never heard of harvest & yield - that's an interesting perspective. Thanks for the link; I'll need to go and ponder that for a while. It seems to take a probabilistic approach to these constraints.

But even in that article the description of CA systems is confusing. They describe it as meaning the system can work only in the absence of partitions. OK, but then what does it do in the case of a partition? The system will presumably do something: accept requests, refuse to reply, ... I don't see how it can be both C and A, though. That is why I equate that choice with a statement of "I believe partitions will never happen", which I personally think is not realistic.


A useful perspective for me has been that C, A, and P are spectra rather than all-or-nothing. A system may bend a bit on availability when some of its replicas are partitioned, because there are fewer servers that can respond at all, smaller aggregate queue depth, etc. It has been said that consistency was specified rather poorly in Brewer's original paper, and we've since seen work that dives into the consistency spectrum. Peter Bailis's "HAT, not CAP" paper is an often-cited example of such work. Aphyr again comes through with a summary of some of these ideas in https://aphyr.com/posts/313-strong-consistency-models .

We know that we can't have 100% harvest and 100% yield in the presence of partitions. We can, however, play with different knobs to shape the curve of harvest and yield as conditions worsen.


Few questions from reading the docs:

1) How big can all the colocated data for a single primary key get before it no longer fits within a single split? Can I implement a Gmail-like product where all the data for a single user resides within one split?

2) Is there a way to turn off external consistency and fall back to serializability, getting better write latencies in return? Is this similar to what CockroachDB provides?


Here is a very interesting video from 2013 of Martin Schoenert explaining the Google Spanner white paper (in German, though): https://www.youtube.com/watch?v=2QKewyoOSL0

Now he works for Google as an Engineering Manager.


Doesn't seem possible to use this yet. No client libraries and no samples: https://cloud.google.com/spanner/docs/tutorials

Have they documented the wire protocol? I couldn't find it.


I work on Cloud Spanner and client libraries are rolling out right now, but API definitions are available.

RPC: https://cloud.google.com/spanner/docs/reference/rpc/

REST: https://cloud.google.com/spanner/docs/reference/rest/


Anyone working on a Rust lang client-library?


We're still working on rolling out a few docs throughout the day. For example, here's the Node.js lib:

https://github.com/GoogleCloudPlatform/google-cloud-node#clo...



> If you have a MySQL or PostgreSQL system that's bursting at the seams

PostgreSQL? How does this work for people migrating from traditional SQL databases? Typically people use an ORM. How would this fit in with, say, Rails or SQLAlchemy?


There is real work involved in migrating your application to use Cloud Spanner. Your schemas and queries will work with some tweaking, but we don't have ORM support at beta.

We are going to produce some additional collateral on migrating from popular RDBMS, but the Quizlet post is the best reference right now for migrating from MySQL.

https://quizlet.com/blog/quizlet-cloud-spanner

We have released client libraries for Java, Go, Node and Python on Github, but we haven't used those clients to implement support for popular ORMs.

Basically, the open-source ecosystem will need to add support for these ORMs but we will contribute wherever we can to push these initiatives. If we get lots of demand for a specific ORM, we will look into doing something special for that.

Hope that helps.

(disclaimer: I work on Cloud Spanner)


There is JDBC support, so if you're willing and able to connect that way, you could hope for an easier migration. Any move from one database to another is a migration, even if only to deal with dialect-specific things.

Disclosure: I work on Google Cloud (but not Spanner).


Very interesting. How does this pricing compare to AWS Aurora? https://aws.amazon.com/rds/aurora/pricing/


Not sure. If you need to scale beyond a single master, Aurora won't help in the same way Spanner does though. You can dial up the number of nodes in Spanner dynamically under load with good results.


It's not fair to compare these since Aurora is a traditional relational database. It does not have a horizontal scalability solution aside from read replicas.


So does Cloud Spanner replace the existing Google Cloud SQL offering [1]? What are the pros/cons of each?

[1] https://cloud.google.com/sql/


That offering is just normal managed MySQL; the new thing is a custom-built database for huge scalability - much more expensive, of course.


Right, that's what I was getting at: a typical web app does not and probably should not start with Cloud Spanner (ignoring the fact it costs $0.90 per hour per node). Cloud Spanner seems like it is attacking the big-data market, correct?


Something like that. But not necessarily the Big Data that's being analyzed - that's usually done with Hadoop/Spark/whatever the big thing is now. It seems to be aimed at, like, huge apps that actually need horizontal scalability.


It's much closer to the Datastore (https://cloud.google.com/datastore/), which also has co-located parent-child tables (with GQL instead of SQL) and automatic horizontal scalability.


Is this similar to AWS Aurora or is this something else completely different?


Aurora is not globally distributed. Spanner is, and is based on Google research which takes advantage of atomic clocks installed in each server: https://research.google.com/archive/spanner-osdi2012.pdf

Edit: not every server has an atomic clock; see replies by Google employees


There are only atomic clocks in some master servers: "TrueTime is implemented by a set of time master machines per datacenter and a timeslave daemon per machine. The majority of masters have GPS receivers with dedicated antennas; these masters are separated physically to reduce the effects of antenna failures, radio interference, and spoofing. The remaining masters (which we refer to as Armageddon masters) are equipped with atomic clocks. An atomic clock is not that expensive: the cost of an Armageddon master is of the same order as that of a GPS master."

The timeslave daemons running on each machine keep them synchronized with the master time servers, and maintain tight bounds on their inaccuracy.

(Disclaimer: I work at Google)


Each datacenter, not server :).


Interesting, but without INSERT and UPDATE it just isn't worth it for me. When can we expect it to handle data manipulation language (DML) statements?


"What if you could have a fully managed database service that's consistent, scales horizontally across data centers and speaks SQL?"

Looks like Google forgot to mention one central requirement: latency.

This is a hosted version of Spanner and F1. Since both systems are published, we know a lot about their trade-offs:

Spanner (see the OSDI '12 and TODS '13 papers) evolved from the observation that Megastore's guarantees - though useful - come at a performance penalty that is prohibitive for some applications. Spanner is a multi-version database system that, unlike Megastore (the system behind the Google Cloud Datastore), provides general-purpose transactions. The authors argue: "We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions."

Spanner automatically groups data into partitions (tablets) that are synchronously replicated across sites via Paxos and stored in Colossus, the successor of the Google File System (GFS). Transactions in Spanner are based on two-phase locking (2PL) and two-phase commit (2PC), executed over the leaders of each partition involved in the transaction.

In order for transactions to be serialized according to their global commit times, Spanner introduces TrueTime, an API for high-precision timestamps with uncertainty bounds based on atomic clocks and GPS. Each transaction is assigned a commit timestamp from TrueTime and, using the uncertainty bounds, the leader can wait until the transaction is guaranteed to be visible at all sites before releasing locks. This also enables efficient read-only transactions that can read a consistent snapshot for a certain timestamp across all data centers without any locking.

F1 (see the VLDB '13 paper) builds on Spanner to support SQL-based access for Google's advertising business. To this end, F1 introduces a hierarchical schema based on Protobuf, a rich data encoding format similar to Avro and Thrift. To support both OLTP and OLAP queries, it uses Spanner's abstractions to provide consistent indexing, and a lazy protocol for schema changes allows non-blocking schema evolution.

Besides pessimistic Spanner transactions, F1 supports optimistic transactions. Each row bears a version timestamp that is used at commit time to perform a short-lived pessimistic transaction validating the transaction's read set. Optimistic transactions in F1 suffer from the abort-rate problem of optimistic concurrency control: the read phase is latency-bound and the commit requires slow, distributed Spanner transactions, increasing the vulnerability window for potential conflicts.
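
A toy version of that validation step (plain Python, not F1's actual code): each read records the row's version timestamp, and commit aborts if any recorded version has changed underneath us.

  class OptimisticTxn:
      def __init__(self, db):
          self.db = db            # key -> (value, version)
          self.read_set = {}      # key -> version seen during the read phase
          self.writes = {}
      def read(self, key):
          value, version = self.db[key]
          self.read_set[key] = version
          return value
      def write(self, key, value):
          self.writes[key] = value
      def commit(self, new_version):
          # The short-lived pessimistic check: abort on any stale read.
          # The longer the read phase, the bigger the window for this
          # to fail -- the abort-rate problem mentioned above.
          if any(self.db[k][1] != v for k, v in self.read_set.items()):
              return False
          for key, value in self.writes.items():
              self.db[key] = (value, new_version)
          return True

  db = {"x": (1, 100)}
  txn = OptimisticTxn(db)
  txn.write("x", txn.read("x") + 1)
  print(txn.commit(new_version=101))  # True: no conflicting write occurred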

While Spanner and F1 are highly influential system designs, they do come at a cost Google's marketing does not mention: high latency. Consistent geo-replication is expensive even for single operations, and both optimistic and pessimistic transactions increase these latencies further.

It will be very interesting to see the first benchmarks. My guess is that operation latencies will be on the order of 80-120ms and therefore much slower than what can be achieved on database clusters distributed only over local replicas.


Spanner's p90s are, for at least one user, consistently lower than 50 ms. https://quizlet.com/blog/quizlet-cloud-spanner

(Disclaimer: I work on Google's cloud.)


> they do come at a cost Google does not tell in its marketing: high latency

Higher latency is fundamental to a CP system; it's kind of a given. It would be nice if they explained it to users though: how achieving consensus is necessary for global consistency, and how it is impossible to do that without waiting.


I'd love to see a Jepsen test. Maybe the Spanner team would be able to sponsor an independent test?


Great product from Google. I wonder what the difference is between Cloud Spanner and Google Cloud SQL.


Cloud SQL is managed MySQL. Spanner is a custom in-house distributed database.


> Unlike most wide-area networks, and especially the public internet, Google controls the entire network and thus can ensure redundancy of hardware and paths, and can also control upgrades and operations in general

I know this is a single system, but I'll still say it. This seems like another step in a scary trend for our internet.


I am an employee, and have my biases, but I’ll always prefer a customer's data stay on our backbone and not be passed through the public internet. Our customers also prefer it, and it's not something other Cloud providers can fully cover.


That probably came off as more offensive than I intended. I just feel a more decentralized internet is a stronger one. Not assuming any nefarious intent.


Why is it a scary trend that Google has made their part of the internet more resilient?


Because we keep giving up more and more control to a single ad company.


The parent comment author seems to be disturbed by the phrase "controls the entire network." Setting aside his unwarranted paranoia, it does make me wonder which organization has the largest and most reliable corner of the internet. Google certainly qualifies by sheer volume of hardware and network infra.


This is no different from a myriad of other companies that have private networks on private fiber that span large geographical paths. Google's is probably bigger than most (both in terms of geography and throughput), but it's nothing new to have private fiber to ensure latency/throughput/reliability.


The SQL syntax reference looks similar to that of Postgres.


Actually it is most closely related to the new "Standard SQL" dialect in BigQuery: https://cloud.google.com/bigquery/docs/reference/standard-sq...


Does anyone know if it works with Django, or a way to make it work? It should be a matter of a connector, no?


SQLAlchemy engine, please? :)


Neat.

How long until it gets shut down with a month's notice?


> Today, we’re excited to announce the public beta for Cloud Spanner, a globally distributed relational database service that lets customers have their cake and eat it too: ACID transactions and SQL semantics, without giving up horizontal scaling and high availability.

This is a bold claim. What do they know about the CAP theorem that I don't?

Separately, (emphasis mine):

> If you have a MySQL or PostgreSQL system that's bursting at the seams, or are struggling with hand-rolled transactions on top of an eventually-consistent database, Cloud Spanner could be the solution you're looking for. Visit the Cloud Spanner page to learn more and get started building applications on our next-generation database service.

From the rest of the article it seems like the wire protocol for accessing it is MySQL. I wonder if they mean to add a PostgreSQL compatibility layer at some point.


> This is a bold claim. What do they know about the CAP theorem that I don't?

It's right there in the article:

"Remarkably, Cloud Spanner achieves this combination of features without violating the CAP Theorem. To understand how, read this post by the author of the CAP Theorem and Google Vice President of Infrastructure, Eric Brewer."

The post they are referring to: https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...


Why? Strong consistency isn't mutually exclusive with scalability. Google has written about it at length[1][2][3].

Furthermore, there are already more than a few attempts underway to build scalable relational databases ("NewSQL") outside Google.[4]

1: https://research.google.com/pubs/pub36971.html

2: https://research.google.com/archive/spanner.html

3: http://datascienceassn.org/sites/default/files/F1%20A%20Dist...

4: http://db.cs.cmu.edu/papers/2016/pavlo-newsql-sigmodrec2016....


[disclosure: I'm the CTO at NuoDB, one of the NewSQL systems discussed in [4] above]

Totally agree with your comment. While the "NewSQL" category is pretty broad and hard to define cleanly, one common theme is understanding the trade-offs that matter to your applications & operators as you distribute & scale SQL services.

For instance, with NuoDB we've decided that standard ANSI SQL with complete DML/DDL support is a must-have for migrating existing applications. As you distribute the database, the ACID contract you expect with those tools also has to be maintained, and we do that. As others have noted in this thread, however, there are many variants of "consistency." I believe if you're trying to exploit distributed scale, and low-latency through locality of reference, then you're probably willing to trade-off serializability. Using MVCC, logical timestamps (not globally coordinated clocks) and a few other well-understood techniques we provide a scale-out solution that maintains standard levels of isolation and consistency. In practice our customers are traditional enterprises using these capabilities to migrate legacy, high-value systems into modern architectures.

What Google has done is something very few other organizations can do: exploit their massive scale and hardware capabilities to offer something unique. From that perspective, they've also chosen to make trade-offs around syntax, indexing, transaction models etc. There's good commentary from my peer at Cockroach about this already on this thread, but the point is that Google picked a view of the database that makes sense for some applications and not others. Personally, I'm excited that Google has opened up this service and I'm sure it will lead to a lot of good introspection on what we need from modern data management services.

For those who haven't read it yet, I'd highly recommend the Quizlet blog referenced elsewhere on this thread. It's a really good discussion of some trade-offs that Cloud Spanner has made, and why they would or wouldn't map well to a given application.

Bottom-line: both Cloud Spanner and NuoDB are examples of the modern, distributed, scale-out systems we're moving towards that focus on consistency, fault-tolerance and modernizing traditional SQL applications.


First time I'm hearing about NewSQL; thanks for these links.


You may be interested in CockroachDB[1] and TiDB[2], which are open-source NewSQL databases inspired by Spanner and F1.

1 - https://www.cockroachlabs.com

2 - https://github.com/pingcap/tidb


Eric Brewer, who is the author of the CAP theorem, works at Google now. He has a post here: https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp....


> What do they know about the CAP theorem that I don't?

I don't know your expertise on the subject, but they do have a post on this topic.

Some highlights:

"Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA."

"The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system."

https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...


"High availability" is not the same thing as the "A" in the CAP theorem. Spanner chooses consistency over availability, but it is a HA system in the sense that it can tolerate datacenter outages.


> From the rest of the article it seems like the wire protocol for accessing it is MySQL. I wonder if they mean to add a PostgreSQL compatibility layer at some point.

It looks like the wire protocol is Protocol Buffers and client libraries will likely use gRPC: https://cloud.google.com/spanner/docs/reference/rpc/google.s...


They write that it's a CP system. So A is not guaranteed, but in practice they are able to provide A most of the time, with a private network and only 1 failure in all of 2016 (whatever that means).


> What do they know about the CAP theorem that I don't?

Their statement, for what it's worth: https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...


> To understand how, read this post by the author of the CAP Theorem and Google Vice President of Infrastructure, Eric Brewer.

Seems like they might know a lot :)


There is a link at the end of the article to a blog post on exactly this topic:

https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...

> The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.

> However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages.


> This is a bold claim. What do they know about the CAP theorem that I don't?

Yours is a bold claim also :) What do you know about the CAP theorem that Eric Brewer doesn't?


> From the rest of the article it seems like the wire protocol for accessing it is MySQL. I wonder if they mean to add a PostgreSQL compatibility layer at some point.

I doubt it; Spanner is not even offering full MySQL capabilities, so it's unlikely to support any advanced PG SQL.



Theoretically it means they are giving up on being partition tolerant. There was a popular post a while ago about how the P can't be sacrificed, because if it is, everything else will fail.

Being Google, they are probably prideful enough to think their servers could never have an outage. Which, yes, I agree with you, is a very scary claim.


This is thoroughly wrong. Cloud Spanner sacrifices the "A", not the "P." The cool thing being accomplished here is that the sacrifice to the A is greatly reduced (five or more 9s). There are several documents on the subject linked right off that page and elsewhere in these same comments, like this one: https://cloud.google.com/spanner/docs/whitepapers/SpannerAnd...


To wit:

In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area, which many find surprising or even unlikely. The claim thus merits some discussion. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA. The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system. However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.

https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp... (written by Eric Brewer, who proposed CAP)


The "choose two" of the CAP theorem is a bit misleading. By creating a distributed system, you've chosen P. So it's really choose one: A or C.


While the product is compelling (an ACID-compliant, horizontally scaling DB), it does seem expensive.

If you run 2 nodes continuously: Cost = 2 x $0.90 x 24 x 31 = roughly $1,340/month, not accounting for storage and network charges.


> $.90 per node per hour

That makes about $670 per month. Is this the minimum, or can we have 0 nodes when the lambda is idle?


[flagged]


If you don't want this account to be banned, please stop posting snarky one-liners as we've asked multiple times.


I see there's "data layer encryption" but the data is still readable by Google. Why would anyone want to keep feeding the Google beast with more data?

Software is about separating concerns, and decentralizing authority. Responsible engineers shouldn't be using this service.


Amazing! But why does this feel like déjà vu all over again... (surely I'm missing something). They've spent 5 years telling us that we just CAN'T scale SQL. Now they'll tell us that actually, they've figured it out! :)


Given the CAP theorem I wonder what trade-offs they make and how much visibility they give you into these trade-offs.

In any case this is much better than Amazon's offerings... when they actually ship it. :)


Check out this post mentioned in the original post:

https://cloudplatform.googleblog.com/2017/02/inside-Cloud-Sp...

Of note:

They say Spanner is "both consistent and highly available despite operating over a wide area". So not 100% availability but they've got it to "more than five 9s of availability (less than one failure in 1066)."


I didn't pick up on it until just a moment ago, but when you say "They say", it's actually Eric Brewer saying that -- who's most known for coming up with the CAP theorem. I think they've got a pretty good understanding of it!


Ha, whoops, good catch! I knew the post was written by Brewer but slipped up with the "they"... yeah, I think we have it on pretty good authority that the system doesn't somehow violate CAP yet addresses its challenges.


I wonder how many people will get a seizure from that red-blue blinking rectangle in the video :(

Update: Downvoting this warning will only increase that number.


> Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA.

It's somewhat ironic that Brewer, the original author of the CAP theorem, is making this sort of marketing-led bending of the CAP theorem terminology. I think what he really should be saying is something in more nuanced language like this: https://martin.kleppmann.com/2015/05/11/please-stop-calling-...

But perhaps Google's marketing department needed something in the more popular "CP or AP?" terminology. I don't see what would be wrong with "CP with extremely high availability" though.

It's certainly wacky to be claiming that a system is "CA", since as the post admits it's technically false; to me this makes it clear that CP vs. AP (vs. CA now?) does not convey enough information. I'd prefer "a linearizably-consistent data store, with ACID semantics, with a 99.999% uptime SLA". Not as snappy as "CA" (I will never have a career in marketing I suppose), but it makes the technical claims more clear.


You're omitting what immediately follows:

> The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.

> However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.


My point is, why call this "CA", if it's not CA? Especially when you immediately go on to explain why it's not really CA?



