Pick strong consistency whenever possible (googleblog.com)
390 points by rbanffy on Jan 15, 2018 | 181 comments



Slightly off topic but articles like this make me weep as I look at my infrastructure and think about all the best practices that we by all means should be following, but aren't because of a senior engineer stuck in the 1990's and is seemingly not only paralyzed by analysis (and bikeshedding), but seems to subscribe to the notion as a way of living, and management who defers to him at every turn despite all evidence of how it holds us back (not to mention directly contributing to outages). I'd give my first born child for immutable servers.

Consistency in database systems refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules.

I would LOVE for this to be a thing. Just as soon as I can convince people that maybe storing my.cnf in VCS is a better idea than updating it in prod. Live.

Sighs.


>I would LOVE for this to be a thing.

It is a thing, just not at that company.

Seriously, get a year or two, then look for escape options.

There's a Mt. Everest climb's worth of distance and effort between "it works" and "best practices", and like successfully climbing Everest, very few companies are doing all of the best practices, and I suspect your dude is going the opposite direction, digging with a spoon.


Seriously, get a year or two, then look for escape options.

Been a year and a half. Have a 2nd round interview lined up for Thursday to get off this crazy train.

I also happen to be reading The Phoenix Project. Which is...strangely therapeutic ;)


There was a job when I was still green where they had fired an architect and they were bringing in more hands on people with a hope someone would step up. Someone opined that architects have a half-life of about 12 months. I ended up splitting lead duties with another person and she and I split the architectural oversight.

Since then the same patterns (alas, minus the diversity) have worked out a number of times and I’m adamant that adding another lead is far better than consolidating that under one person, at least until you have half a dozen leads.

In fact my most useful mentor and boss said to me look, I trust you guys to sort this out. I want to hear the plans and ask questions but I only need to be involved when you can’t come to an agreement. And then I can be your tie breaker.

I think a lot of things would go better if folks in these positions would audit and break ties and worry about bigger picture stuff most of the time.


In fact my most useful mentor and boss said to me look, I trust you guys to sort this out. I want to hear the plans and ask questions but I only need to be involved when you can’t come to an agreement. And then I can be your tie breaker.

I had a boss like this who very much believed in that sort of autonomy, only getting involved, as you said, to be the 'tie breaker', but who still led a great engineering team and was a great source of knowledge and mentorship.

I say 'had' for a reason. That reason is mentioned in the post that started this comment chain. Now what's that tell you? Heh...heh :(


I had a different boss who trusted us implicitly, but essentially we were managing ourselves. Unlike Mr Mentor we didn’t talk shop and he didn’t challenge us very much, or at least not concretely, to do better. In fact I was only ever aware of him doing about 20 hours of work a week and half of that was meetings (the other half administrivia).

Given the politics there I suspect the rest of his time was kissing butts or covering them. Since the end of the dotcom era I’ve never seen a more high minded bunch of Emperors New Clothes types all in one spot. Woof.


> I also happen to be reading The Phoenix Project. Which is...strangely therapeutic ;)

Huh, I started reading that earlier today. Between the book and replaying levels in Infinifactory, the words "bottleneck" and "backpressure" loom large in my mind.


Some people get ahold of tunnel boring equipment and go as fast as possible in the wrong direction. With this mountain in the way we can’t tell if there’s anything cool on the other side, but damn are we hopeful.


Not really related to your main point, but speaking as a greying "old guy", one of the things that's really hard is to gracefully allow the next generation to contribute. I still vividly remember starting out and being incredulous about some of the stupid ideas the senior guys on the team had. I was proven right quite a lot of the time too. It's easy to build that inner narrative of the hero righting wrongs.

But then fast forward to the back end of your career and you've got a kind of weird dilemma. The vast majority of ridiculous things that the young guys try to push are absolute crap. They read about it in a blog post somewhere and don't have nearly enough experience to understand the nuances. Or they have yet another stupid f-ing framework that they just know is going to revolutionise the industry. Yeah... been there, done that, my closet is overflowing with T-shirts, thank you very much.

You sit there thinking, "I spent 20 or 30 years fighting my way through this crap and when I finally have a good handle on it, these new guys are just going to throw it all away". But... Dammit. You absolutely know that one of them has a great idea and you're not quite sure which one of them has it. Because they all think they have it sorted (even when none of them do). They feel they are completely sure that "this is the best way and only a moron couldn't see it". But you know that it's always more subtle than it appears. If you let them all run with their ideas, then 90% (99%???) will crash hard... Let's face it -- just like you did when you had that stupid idea that you were absolutely certain was "the way forward".

Anyway, I belabour the point. From my vantage point, it's pretty difficult to let go. Some people just can't do it. In my position now, I have a lot more sympathy for the "old guard" than I did when I was younger. To be honest, I think it is technically impossible for me to have less sympathy than I did -- and I think this is pretty common in the industry. Probably both sides of the equation are in a similar situation -- neither side feels comfortable letting the other side make decisions.

Just talking out of the hole in my head right now, but I think the main thing is that programmers generally like to program. They like to have freedom to try approaches that they think will be successful. As much as possible, it's probably best to allow your teammates, whoever they are, to have the freedom that they need. At the same time you should expect to have your own freedom. Getting the balance is not easy and sometimes you have to change jobs in order to get into a team where you are allowed the freedom to do your best. In my experience, teams that take this seriously are the best teams to work on (even if they sometimes do stupid things).


You know, I really appreciate this response and perspective; because it makes total sense. It's human to want to stick to what you know and only natural to be hesitant of the unknown, especially when you have the tenure to have seen it all over the course of a career. That experience is valuable and it's why in many cases I do lean upon this senior guy for input on specific topics I know he is the best resource for.

That said, in this specific context of tech (or at least in my wheelhouse of Ops), this is surmounted by creating and fostering a culture where it's okay to experiment and fail.

Where we've been handicapped in this regard is in specifically not doing that: the particular individual who makes this difficult has the mentality that any new approach spells doom, and actively works against initiatives to put us in a better posture of experimentation in environments that don't sacrifice the stability and reliability of our stack (hence the sarcastic comment about giving up my first born child for server immutability--I don't even have kids, haha).

But that said, I think you're absolutely right. I'm not quite greybeard yet -- mostly because I yank out every grey hair I see trying to establish residency on my chin -- but nonetheless your point is accurate, well taken, and definitely has immense merit.


And if you allow experiment and fail, you end up with 20 different frameworks in your app.

And that means a huge cost of inconsistency. Every time you need to make a little tweak you have to learn how that part of the app works.

When you just want to get shit done, being able to go into any part of the app and knowing it's built exactly the same way as every other part means you don't have to spend any time learning [insert framework X here]; you spend time doing: implementing business rules and fixing bugs.

A lot of programmers want to spend time playing with tech, not doing, because playing with tech is more fun than doing the job.

Worse still, juniors (including me back then) are woefully unaware of just how clueless about programming they actually are, as alluded to by the parent, and that means they have a massively over-inflated opinion of their own views. They also tend to be wildly optimistic about how long a change will take and totally disregard the human cost (the cost of re-training and loss of productivity) of changing a piece of tech.


This is something I think about a lot nowadays. You have to try and fail a lot to develop an intuition for good design. How do you give people the room to experiment without paying for all the mistakes? It is tremendously expensive to produce a senior developer yet vital to do so because junior developers mostly produce crap. We could only assign them projects off the critical path, but we can't really afford to have anyone working on anything but the critical path. Plus it's a lot to ask of someone's ego to say, here work on things that don't matter for 5 or 10 years and we'll talk.


I was randomly watching Youtube the other day and there was an ad for Herbie Hancock's Masterwork class (ha ha -- not a referral, no idea if it's any good). Anyway, in the ad he mentioned that he was playing with Miles Davis one time and that he made what was technically a mistake. Miles listened to it, picked it up and made it right. Suddenly I realised: "That's it! That's what I want to do." But, of course, it's much easier said than done. I won't pretend that I'm able to do it yet :-)


You do it by modularising code into smaller parts that communicate in well-defined ways, and enforcing that. Then you can take whoever and have him make one module with a lot of autonomy. If he fails and makes a mess of it, someone else (or a more experienced version of himself) can rewrite it relatively easily.

Meanwhile, you let him make decisions in his module/plugin/part/web service/whatever while you enforce rules between them.

Also, the work off the critical path matters too; it would not be done at all otherwise, so treat it like it matters. It won’t hurt a junior’s ego if older colleagues take that job seriously.


That's what "side projects" are for. Go and experiment as much as you want. Then after you've succeeded, make a presentation to technical management showing what you've learned.


I'm the oldest guy on my team and have an architect title, yet I'm having a very hard time getting my younger colleagues to adopt CI/CD, microservices and agile. Some of them know what they are and that they sound cool, but don't know how to do it. Some of them actively oppose it.


"Sound cool", do you have a better reason to change the way your colleagues make a living other than it sounds cool?


I'm saying the team knows they "sound cool". I have lived them all and know exactly what they're good for and why.


Usually, if you define requirements you can make an objective choice. The old guard is too attached to the stuff they are fluent in, and young people follow hype too much.

As a technical field, we do not follow the scientific method.


> As a technical field, we do not follow the scientific method.

Surely we should be looking to engineering here, not to science.


This is ageism plain and simple.

You clearly have no clue what the predominant view in the 1990s was. Consistency wasn't just _a_ thing in the 1990s. It was a core dogma even beyond database systems. It was top of the agenda for a while until it sort of fizzled out in the early 2000s when distributed transactions hit a complexity wall.


How do you reduce transactional complexity by throwing out consistency? Genuine question. At the very least, it seems like you need to mark potentially inconsistent data as dirty.


I didn't mean to imply that you could. But the simplicity that consistent distributed transactions can deliver in principle can easily get drowned out by accidental complexity created by a specific implementation.

For instance, you have to get all participants in a distributed transaction to support the same standard protocol without too much bridging, wrapping, special interface description languages, tons of configuration metadata, etc. You have to agree on a small stable set of features that everyone implements in a robust way.

But the major corporate sponsors of the competing approaches originating in the 1990s and early 2000s got disrupted before any one of them could gain enough traction to bludgeon everyone else into compliance.

My point was merely that almost no one in the 1990s doubted the necessity of consistent transactions. Eventual consistency and all the rest only became popular with NoSQL and the general desire to imitate internet giants that didn't even exist in the 1990s.


Sounds like said "senior engineer" is an example of what I discussed in response to https://news.ycombinator.com/item?id=15975130.

Don't let it happen to you.


> Slightly off topic but articles like this make me weep as I look at my infrastructure and think about all the best practices that we by all means should be following, but aren't because of a senior engineer stuck in the 1990's and is seemingly not only paralyzed by analysis (and bikeshedding), but seems to subscribe to the notion as a way of living, and management who defers to him at every turn despite all evidence of how it holds us back (not to mention directly contributing to outages). I'd give my first born child for immutable servers.

I know that feeling, although in my case it's non-technical management who just want things to be frozen in time.

C'est la vie at some companies unfortunately. Good luck with your escape plan. :)


What does the 1990s have to do with it? The 90s were all about data consistency (probably too much so).


> all the best practices that we by all means should be following, but aren't because of a senior engineer stuck in the 1990's and is seemingly not only paralyzed by analysis (and bikeshedding), but seems to subscribe to the notion as a way of living, and management who defers to him at every turn despite all evidence of how it holds us back (not to mention directly contributing to outages)

Huh, that's exactly how I feel about the "must use a RDBMS for all data" senior architects I've worked with.


>> storing my.cnf in VCS is a better idea than updating it in prod.

Eh, 'but we can't store things/API keys/passwords in VCS! It's bad!' is a pet peeve of mine. No, let's not 'just update it manually when we need it'. And no, if an outsider getting your DB password (somehow) poses an existential threat... chances are that you have way more concerning problems.


Storing passwords and other secrets in VCS is a phenomenally bad idea, particularly when your team size grows past two or three people.

All it takes is one fuckup of accidentally committing to a public repo. Or firing someone on the team but not thinking to rotate secrets. Or hiring an outside contractor that now you implicitly give access to your production AWS credentials, etc.

Secrets do not belong in source control, hard stop.


I don’t understand why secrets should be excluded from source control. It seems like a perfectly fine place for them to be stored to me especially when talking about secrets which the developers require in order to develop the code or maintain a set of systems.

If the secrets in source control are:

(1) encrypted

(2) never have decryption keys stored/loaded on any developer machine

(3) never have stored decrypted representations (only in memory representation of decrypted forms allowed when required).

If you follow these rules you won’t be more likely to accidentally commit unencrypted versions of the secrets and you’ll also by necessity have setup some (auditable) gatekeeper for logging of decryption events (via an aws kms or similar decryption as a service api).

For the category of secrets which must never exist in any decrypted form on a developer machine -- maybe I can see the argument that those should be left out of source control, as this would reduce the surface area for offline attacks against the encrypted form of the secrets -- but I would guess that this actually represents a somewhat minor gain in practice? And it also seems irrelevant to use cases where access to the decrypted representation of the secrets on the developer’s machine is mandatory ...
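
For what it's worth, here is a minimal sketch of what rule (3) can look like in practice, assuming AWS KMS as the decryption-as-a-service API (the file path is a made-up example):

  # Sketch: read a secret that lives in the repo only in encrypted form.
  # Assumes the ciphertext was produced with KMS and that only the runtime
  # role (not developer machines) is allowed to call kms:Decrypt on the key.
  import boto3

  def load_secret(path="secrets/db_password.enc"):  # hypothetical file in the repo
      kms = boto3.client("kms")
      with open(path, "rb") as f:
          ciphertext = f.read()
      # The plaintext exists only in memory and is never written back to disk.
      return kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"].decode("utf-8")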


It's so easy to avoid having them in VCS though. Any secrets can be loaded from a configuration file placed in ~. Populated configs can be selectively distributed so you can e.g. only place the production config file (with the prod server AWS keys for example) on the CI server. Everyone else just gets a config file with non-prod secrets. This can help avoid a lot of mishaps.
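
As a rough sketch of that approach (the file location and key name are just examples):

  # Sketch: load environment-specific secrets from a config file outside the repo.
  import json
  import os

  def load_config(path="~/.myapp/config.json"):  # hypothetical location in the home dir
      with open(os.path.expanduser(path)) as f:
          return json.load(f)

  # e.g. aws_key = load_config()["aws_access_key_id"]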


The problem with config files outside of vcs is keeping them in sync, which leads to bugs, insecure sharing methods, and constant interruptions as developers need to bug each other to get the latest values.

Shameless plug, but check out EnvKey[1].

With 10-15 minutes of effort, it gets secrets out of git, out of email, out of Slack, etc. It manages encryption keys safely behind the scenes, protects development secrets just as strongly as production, keeps devs and servers automatically in sync, and greatly simplifies access control and key rotation.

It's not the only solution out there, but it's by far the easiest to set up and work with. In any case, use something that truly solves the problem! Don't settle for half-measures that end up spraying secrets all over third-party accounts. This stuff is serious--even when it comes to so-called development secrets, the line is fuzzy.

1 - https://www.envkey.com


Shameless or not, this is the first time I’ve seen a hazmat[1] handler I understood by the end of the first page and the price is low enough I’m just signing up now to use it for all my personal projects. Thank you for sharing it here.

1 - encryption keys and secrets are “hazardous material”, shortened to just “hazmat”. While necessary and arguably crucial in our work, they deserve the same care and respect a chemist would have for a beaker full of particularly dangerous chemicals.


Glad to hear it! Feel free to reach out with any feedback/questions/issues: dane [at] envkey.com


Also, it really depends on the environment, the nature of secrets, etc. It's never great, but the level of badness varies dramatically.

If you have a team of 2-3 people, where everyone has access to everything anyway, and the secrets can't be used from outside the company (i.e. you need access to servers in the first place to authenticate with whatever service is configured), then it's one thing. It's bad, but it sure beats 'let's really carefully update it on all machines manually'.

I just really, really don't like blanket statements such as 'you shouldn't do that, full stop, context is irrelevant'.


Let's assume that all the above is true.

Even then, secrets still don't belong in source control. They aren't tied to a specific version of your software -- they're tied to state in other systems: AWS, your MySQL database, a third-party API, etc. Those systems will change independently of any particular release of your code, so versioning the secrets in source control doesn't actually make sense.


They are tied to the overall state of an application though. Much like db versions, 3rd party apis etc.

You must track them somehow and a VCS is obviously the right thing to store them in.

Depending on the circumstances, I use git-encrypt, Ansible vault, vim -x (encrypt files) or even RCS (no chance of pushing to a repo when RCS doesn't have that feature).

This is one of my criticisms of the 12 factor app. They say put the config in the environment. That doesn't address the question. How and why does it get into the environment and from where?

https://12factor.net/


Secrets belong wherever it makes the most sense for them to be, balancing operational needs, security, and compliance requirements.

Storing secrets in version control (with permissions to the repo tightly controlled), encrypted (with the decryption key only available on the compute needing secret access), and used for rendering into a discovery mechanism (or directly injected into instance env vars) is entirely legitimate (and I know of several orgs doing this).


It is wholly possible to store configuration data in vcs in a way that doesn't sacrifice depth of security when it comes to secrets and passwords.

But otherwise yes, storing passwords in vcs is bad. That wasn't necessarily what I meant to imply by suggesting storing configuration files in vcs, however. I probably phrased that poorly.


Related to the overall topic but not Spanner specifically: it's interesting to compare the consistency models of S3, GCS and similar offerings:

* S3: https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction...

* GCS: https://cloud.google.com/storage/docs/consistency

* Azure: exercise left to the reader

The best usages of S3 I've seen are designed to only use the one major form of strong-consistency that S3 guarantees: if you write a key that has never been written to or read from, and then do a read, it will always work. Sticking to this guarantee will avoid nasty difficult/impossible-to-diagnose bugs. To take advantage of this you do things like:

A. Use unpredictable keys (e.g. prefix with a GUID) to guarantee that you don't get EC behaviour; communicate those keys to other services/users after the upload has succeeded.

B. Store immutable/content-addressable data. Use new keys for new versions of the data (don't ever over-write a key.) If necessary, use an external (strongly-consistent) index to map identifiers to S3 keys.

C. Avoid "upload if not exists" in favour of just uploading (e.g. in the case of a possible failure just reupload, don't check if the key exists on S3. Uploads are atomic.)
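
To make A and B concrete, here's a minimal sketch assuming boto3; the bucket and the "external index" are placeholders:

  # Sketch: write-once S3 keys that are never read before they are written.
  import uuid
  import boto3

  s3 = boto3.client("s3")

  def store_document(bucket, body):
      key = "docs/%s" % uuid.uuid4()                    # unpredictable, brand-new key
      s3.put_object(Bucket=bucket, Key=key, Body=body)  # uploads are atomic; on failure just re-upload
      return key                                        # hand the key out only after success

  # A strongly consistent index elsewhere (an SQL row, say) maps a stable id to the
  # latest key; new versions of the data get new keys, old keys are never overwritten.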

GCS offers a far greater amount of strong-consistency and this makes it much more flexible.

The key to scalability/performance/simplicity of services like S3 is lack of transactions between keys, not eventual consistency. It's informative to start from Paxos and figure out how you'd build an S3 and why it might exhibit EC behaviour like S3 does (hint: a caching front-end), and question what value that really brings vs. the extreme decrease in usability (consider that a caching front-end doesn't give you scale-out in the way that lack of cross-key transactions does.)


> The best usages of S3 I've seen are designed to only use the one major form of strong-consistency that S3 guarantees: if you write a key that has never been written to or read from, and then do a read, it will always work.

The same applies to OpenStack Swift, a similar but self-hosted product. Like S3, when it comes to the CAP theorem, they choose availability and partition tolerance.

Source: I'm running Swift clusters at work, and developing applications on top of it for customers and internal use.


Applying CAP to come up with useful/practical conclusions is a lot harder than it seems.

The fact that S3 offers the form of strong-consistency that it does means that S3 is already suffering most of the "problems" that strong-consistency implies. If S3 suffered a particularly bad network split, it may not be able to serve reads because it can't be sure that a write didn't happen elsewhere. This obviously applies to new keys (due to their documented guarantees.)

To demonstrate why CAP doesn't say enough to be interesting: imagine building an S3 by starting with etcd (which is based on Raft.) etcd is CP but has scalability/availability issues, so it won't scale out well. Now run multiple independent etcd clusters and shard your keyspace onto those clusters. Now you have arbitrary scalability and your availability is better than a single etcd cluster. Now of course if you were building an S3 you wouldn't actually run it like this (the same process could participate in different Paxos/Raft logs at the same time) but that's the high-level idea.

My point: You can get arbitrarily more availability and throughput without sacrificing strong-consistency, except in some really odd scenarios (no machine can talk to another machine, for example, and you must replicate to multiple machines.) CAP is about 100% availability (vs. 99.9999999%) but some scenarios are dumb and 100% isn't usually necessary. The key to this is that keys in S3/GCS are independent; there are no cross-key transactions.
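
A toy sketch of that sharding idea, with plain dicts standing in for per-shard CP stores (each one would be its own etcd/Raft cluster in reality):

  # Toy sketch: shard a keyspace across independent, individually consistent stores.
  import hashlib

  class ShardedStore:
      def __init__(self, shards):
          self.shards = shards  # e.g. one client per etcd cluster

      def _shard_for(self, key):
          h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
          return self.shards[h % len(self.shards)]

      def put(self, key, value):
          self._shard_for(key)[key] = value   # strongly consistent within its shard

      def get(self, key):
          return self._shard_for(key).get(key)

  store = ShardedStore([{} for _ in range(4)])
  store.put("bucket/object-1", b"hello")
  assert store.get("bucket/object-1") == b"hello"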

Eric Brewer (who coined the term CAP) wrote about this with respect to Spanner here: "Spanner, TrueTime & The CAP Theorem" https://static.googleusercontent.com/media/research.google.c...


I should note that the demonstration is a toy example, you don't actually want/need to store blob data in a paxos-like log directly. For an in-depth look into what it takes to build a strongly-consistent S3-like system (and more) you could check out the paper "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency". A link to the paper, a talk and slides is on this page: https://blogs.msdn.microsoft.com/windowsazurestorage/2011/11...


How do you get rid of old data if you (basically) treat S3 as an append-new-files-only store?


That really depends on the application, who has access to the keys, how they learn about the keys etc. There isn't a one-size-fits-all solution and that's kinda why eventual consistency is a pain in the ass.

The "I want strong consistency but have to use S3 because it's the cheapest way to store stuff and it's great at serving reads and just otherwise great" option is to maintain an external index that maps between persistent ids (e.g. file paths) and S3 keys. We use this for hosting user content at my company. Our index is in MS SQL of all things ;)

The MVP option is to not delete stuff. :) This can work well in some scenarios (e.g. you want to keep all versions around anyway) but be careful to not dig a hole that's hard to get out of.

A more esoteric option is lifecycle based cleanup. This can work well for things like a build artifact cache (where an expired key isn't a big deal because you can regenerate it.)

Some of this is discussed in this blog post by Netflix: https://medium.com/netflix-techblog/s3mper-consistency-in-th... (caveat: s3mper is dead(?) and there is an AWS-native solution for HDFS now.)


Possibly an obvious comment, but reading your post it struck me that any way you slice it, you're more or less building a garbage-collection system, with all the various tradeoffs those entail. Different performance issues since it's S3 buckets rather than RAM, but still very dependent on who has references to what, and when you check for it. Your two examples sound to me like: You can have some set of known exhaustive roots and periodically delete anything not still reachable from them. Or you could have a lifecycle option, analogous to arenas (region-based memory management). Etc.


https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifec...

If you treat S3 as a "I only write new files to you and I want my files to have a TTL," Object Lifecycle Management is the feature for you.
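
For example, something along these lines (the bucket, prefix and numbers are placeholders) sets a 30-day TTL on a prefix with boto3:

  # Sketch: expire objects under a prefix after 30 days.
  import boto3

  s3 = boto3.client("s3")
  s3.put_bucket_lifecycle_configuration(
      Bucket="my-bucket",
      LifecycleConfiguration={
          "Rules": [{
              "ID": "expire-old-artifacts",
              "Filter": {"Prefix": "artifacts/"},
              "Status": "Enabled",
              "Expiration": {"Days": 30},
          }]
      },
  )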


  For example, a financial application might 
  need to show users' account balances. When 
  users make a deposit, they want to see the result 
  of this deposit reflected immediately when 
  they view their balance (otherwise they 
  may fear their money has been lost!). There should 
  never appear to be more or less money in 
  aggregate in the bank than there really is. 
Why does this example continue to come up time and again? Banking absolutely doesn’t work this way. We feel like it should, but deposits / transfers can take days to post, bank balances can absolutely reflect “incorrect” values, etc.


"In other news, GOOG jumped by 7% today upon speculation that Google would soon introduce a banking product offering instant transfers between accounts. The speculation arises from a technical blog post, which describes Google's vision for how a banking website ought to look like, and stood out because no American banking product actually works in such a fashion today. Google's new banking product would compete with banks like Monzo, which also attempt to disrupt the legacy systems that power most contemporary banking infrastructure."


Because banking in some countries actually does work like that.

In Europe, since November 2017 banks are moving towards the SEPA Instant Credit Transfer system, which allows transfers in under 10 seconds, between all banks.

So if you’re today using a European bank, your transfers might actually work like this.


It could be under 100 ms, but that still doesn't tell you the consistency model.


Requiring by law that transactions complete (and show up!) in under 10 seconds, EU-wide, does impose some requirements on the consistency model, though -- requirements that eventually consistent models cannot meet.


Banks do post current balances, which reflects how ledgers work. The current balance also reflects transactions that haven't posted, and that seems to be in line with the example.


Banks use eventual consistency for your balance. You just don't notice because it's usually really quick.

The reason the ATM has a limit is because it's eventually consistent. It's possible you just went to another ATM and drained your account, but that hasn't reached the main office yet.

They are limiting their risk by allowing you to defraud them of up to $500 (or whatever your limit is).


Do you know that, or is it your conjecture? There are a lot of other possible reasons to use $500 as a limit. For example, preventing people from structuring withdrawals at the ATM, preventing an accident from producing too much cash, limiting the frequency with which ATMs need to be serviced, and limiting the downside of unauthorized withdrawals.

It is true in some sense that the banking system is eventually consistent, because commands can be issued asynchronously. For example, you can issue a check, and expect it to be drawn in the future. However, that does not necessarily apply to the ledger itself, and given what I understand about banking technology, I'm somewhat surprised if it is truly as you describe.

If it is indeed a conjecture of yours, I urge you to make it more clear in your comments when you are speculating and when you are speaking with authority.


I know that. I've talked with people who work on banking systems who have confirmed it.

It's true there are other reasons for the limits, such as the ones you listed, but if you are with say Bank of America, your limit and my limit can be different. The more credit history you've built up with the bank, the higher the limit will be. This is mostly because they trust you more and therefore consider the risk for defrauding smaller.


It takes a fraction of a second to send the withdrawal to the main office - how would a scam based on this really work? A second card with another person at a different ATM? Even then, you'd think you could serialise per-account without affecting throughput at all.


I'm not sure but I've seen Visa terminals that behave as if they enter some kind of offline mode when their ancient GPRS connection in the middle of the woods times out. In this mode you can only do smaller purchases and they are then later sent to the bank once the store gets a connection. This is for Visa debit btw, not credit.


Sure, a fail-safe mode makes complete sense. I'm just doubtful that there's a need to be in this mode 100% of the time.


Here is a news article from 2013 about such a scam:

http://www.nytimes.com/2013/05/10/nyregion/eight-charged-in-...


There's not really enough detail here to support either point of view. There's hacking & skimming in addition to hitting withdrawal limits. It's arguable from the article's information that implicitly trusting up to the withdrawal limit was partly to blame. I'd prefer to see some evidence that it's necessary to put this trust in place at all.


> It takes a fraction of a second

99.999% of the time? It's impossible to guarantee that 100% of the time.


And it would still be acceptable to wait 0.001% of the time, would it not? Or degrade to limit mode in that situation?


It's worth noting that in general there's no "the banking system" (unless we're talking about particular single systems, e.g. Fedwire); there's a mishmash of many different, highly diverse systems used by different organizations with quite opposite properties. For example, in the context of the comment above, I happened to participate in a cards-system migration some 15 (?) years ago where the old system did have the eventual consistency issue described above (mostly mitigated by limits; there were just a couple of times where double-spending happened), and the new system ensured strong consistency. So it can be one way, and it can be the other.


I have worked at a bank and they definitely didn’t use eventual consistency for transactions. Money could be in flight when it transitions inside the bank, but never to the client, due to fraudulent behaviors, regulations etc. However, not all transactions were dealt with electronically, and those were treated differently, mostly outside the systems until effective.


I really do not think this statement is true. AFAIK the limits are in place to limit fraud (if someone has skimmed or stolen your card) and to prevent theft (someone mugging you as you withdraw a huge amount of cash).

If what you are saying is true it would be possible to go overdrawn at an ATM. This (again AFAIK) doesn’t happen as there is a call to check your balance.


The banking system is rather complicated. I think it would be possible to overdraw in some situations.

See also this article where hackers stole $45 million from ATMs:

http://www.nytimes.com/2013/05/10/nyregion/eight-charged-in-...


> If what you are saying is true it would be possible to go overdrawn at an ATM.

It is most definitely possible, it just rarely happens because the network doesn't fail often.

See here for more sources: https://www.google.com/search?q=eventual+consistency+atm


"A strongly consistent solution for your mission-critical data

"For storing critical, transactional data in the cloud, Cloud Spanner offers a unique combination of external, strong consistency, relational semantics, high availability and horizontal scale. Stringent consistency guarantees are critical to delivering trustworthy services. Cloud Spanner was built from the ground up to provide those guarantees in a high-performance, intuitive way. We invite you to try it out and learn more."

Is this a blog post or an advertising whitepaper?


From the article:

> To quote the original Spanner paper, “we believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”

Isn't this a strong argument against the current Microservices trend?


Yes. Microservice-first is and always has been a terrible idea; spinning out microservices as bottlenecks arise is much better. Most successful microservice systems were monoliths first.


Microservices are about organisational structure, not performance. The primary difference over modularising via libraries is independent release cycles. Every time you update your libraries you also have to update all your applications even if the change is backwards compatible. Microservices avoid this problem.


That sounds wrong.

First of all, backwards compatible changes in libraries don't need application updates, by definition.

Second, if your library changes are backwards compatible, you can maintain separate release cycles - this is the normal way of doing things, actually.

Third, microservices simply don't allow backwards incompatible changes. This is a big hint that perhaps your libraries shouldn't do backwards incompatible changes either, as a matter of policy, even though doing them is theoretically possible.

Perhaps you meant deployment instead of releases, and were assuming static linking?


Probably what the parent post means by "backwards compatible changes" is that the app doesn't need any changes to use the new behavior of the library, but it needs to be rebuilt/repackaged/redeployed to get that new behavior.

An often seen scenario is that small changes (a one-line bug fix, a tweak in behavior) in application functionality happen to be in the shared library code, so to get that fix to production, you need to redeploy all apps after the library change. Microservices allow you to redeploy just that single shared service.


> happen to be in the shared library code, so to get that fix to production, you need to redeploy all apps after the library change

Doesn't this simply mean that the deployment process is somewhat broken?

On a typical Linux distro, you run "apt-get" or "yum" to upgrade your Heartbleed-fixed OpenSSL, the package manager automatically restarts all affected daemons, and you're done with it.

Even if you deploy static images, a good build system should just re-run the linker after the library change and be done with it.


We seem to be talking about quite different things - in the context of discussion about monolith vs microservice architectures, the expected (at least to me) context is deployment of in-house applications within an enterprise.

This means that even on Linux distro's the packages generally are not managed by apt-get or yum. This means that "shared libraries" don't necessarily mean Linux/C shared libraries, but are likely to be packages of Java/PHP/C#/whatever code that are re-used by multiple applications. This means that, for starters, you need to choose when and how you'll push this update to many different machines each running different applications. This means that "automatically restarting all affected daemons" isn't trivial - some of these services may have state (e.g. a deployment that also includes a change to a database), for some of them the downtime matters, you likely don't want to stop&restart all of the instances doing the same thing at once but rather do a rolling deployment. This means that some of the instances that need to be updated are out of your control/administration - you need to send the package to some other organization so that they can re-deploy it. Etc, etc.

Sure, a good build system should just re-run the linker after the library change and be done with it - that solves the packaging part. Deployment often is much trickier; some organizations have a good, robust deployment process but many (most?) don't.


In my experience, projects which try to follow that strategy inevitably end up with a handful of core libraries containing 90% of active development. The incentive gradient doesn't work; if a project is time-sensitive, the developer just won't agree to do twice as much work to maintain separation of responsibilities.


> Microservices are about organisational structure, not performance

The above statement is also not about performance. (It is about keeping "consistency by default" even if you do have performance critical areas.)


Am I the only one reading this on a mobile? What?


I read it on mobile as well -- but I found the content really compelling and struggled through decoding it. At some point I invented a reason that they might've done this on purpose -- my brain's elevated error-rate while decoding sometimes produced wholly incorrect sentences (which I would hopefully correct by re-reading). These errors are sort of analogous to the kinds of timing dependent semantic errors you can get with distributed systems operating on data stored with weaker consistency models...

So -- google art project?


Wow. 1 letter wide columns!


Ironically for an official Google blog... it renders fine on Firefox for Android, but not Chrome.


Firefox, Chrome and Safari all broken on iOS.

Google could catch this automatically...


You read that? I took one look and noped out.


What is the AWS equivalent to Cloud Spanner?


There is none. DynamoDB would be the closest - it's a HA DB for OLTP workloads (but without the transactions). AWS do not have the TrueTime API that Spanner builds on to provide globally consistent transactions.

AWS's storage offering S3 is pretty limited in terms of its consistency model: https://medium.com/@jim_dowling/reflections-on-s3s-architect...

In contrast, Google's filesystem (yes, not an object store, but a filesystem), Colossus, is hardly known. It provides an HA consistent metadata layer to a distributed filesystem. The closest open-source architecture is HopsFS (distributed metadata for HDFS) - disclosure: I work on Hops.


I would add to this by saying that AWS Aurora "looks" the most similar to Spanner by being a transparent scale-out relational database with strongly-consistent transactionality, but it doesn't seem to be designed for geo-replication.


Agree. AWS Aurora is closest to Spanner and would work for 95% of use-cases... master-master, cross-colo or geo-replication is a great problem for less than 2% of companies...


Multi-master Aurora is actually in Preview right now[1]. Only for the MySQL variant of Aurora though, not the Postgres variant.

[1] https://aws.amazon.com/about-aws/whats-new/2017/11/sign-up-f...


AWS Aurora supports cross region read replicas. Is geo-replication something different?


Isn't DynamoDB more in the direction of Google Cloud Bigtable/Datastore though?

If I'm right with this, then this is basically stating: "yeah, aws also has a high scale database offering" but it's not really comparable anyhow.


Actually, AWS recently released a time sync service that anyone can use, so you could possibly build it: https://aws.amazon.com/blogs/aws/keeping-time-with-amazon-ti...


AWS tries to cover this strategically with Aurora (as well as Serverless Aurora) and DynamoDB.

Serverless Aurora can autoscale up and down based on usage, and Aurora has powerful replication functionality, but this is a far cry from Spanner's multi-master horizontal scaling.

DynamoDB can autoscale (crudely), and they just added multi-region multi-master. It's non-relational though and requires more capacity planning than Spanner.


I mentioned it in another comment, but along with Aurora Serverless they also announced Aurora Multi-master in Preview[1].

No clue how far it is from General Availability or if it allows even remotely close to the horizontal scaling of Spanner. But it’s at least being worked on.

[1] https://aws.amazon.com/about-aws/whats-new/2017/11/sign-up-f...


No mention of partitioning/sharding.


CockroachDB + AWS Time Sync.


gcloud also has an atomic-clock-backed NTP service: https://developers.google.com/time/

Neither provide the guarantees of TrueTime.


This might be interesting: https://www.cockroachlabs.com/blog/living-without-atomic-clo...

Excerpt: "...The designers of Spanner call this ‘TrueTime’, and it provides a tight bound on clock offset between any two nodes in the system. TrueTime enables high levels of external consistency. As an open source database based on Spanner, our challenge was in providing similar guarantees of external consistency without atomic clocks."
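
To get a feel for what the bound buys you, here's a toy sketch (not Spanner's or CockroachDB's actual code) of the commit-wait idea; EPSILON stands in for the clock-uncertainty bound that TrueTime exposes and that NTP-based systems can only estimate:

  # Toy sketch of commit wait: choose a commit timestamp at the top of the
  # uncertainty interval, then refuse to acknowledge the commit until every
  # node's clock must have passed it, so timestamp order matches real-time order.
  import time

  EPSILON = 0.007  # assumed worst-case clock offset (7 ms), for illustration only

  def commit(apply_writes):
      commit_ts = time.time() + EPSILON           # like TT.now().latest
      apply_writes(commit_ts)
      while time.time() - EPSILON <= commit_ts:   # commit wait: until TT.after(commit_ts)
          time.sleep(0.001)
      return commit_ts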


How does Spanner stay competitive with other NewSQL databases? Doesn't the pricing and the API lock-in mean that people will just consider open-source NewSQL databases first?

(Not a shill for any company, honestly just curious.)


It's really hard to compare database performance, but it looks like Spanner probably is at a pretty competitive price-performance point.

Google advertises that a $0.90/hour node can do 10,000 reads per second and 2,000 writes per second. Amazon wrote a blog post comparing Aurora and PostgreSQL performance and got ~38,000 TPS on Aurora (vs. ~18,000 on PostgreSQL) on a m4.16xlarge, but that would cost $9.28/hour (https://aws.amazon.com/blogs/aws/amazon-aurora-update-postgr...). Given that you could make a 10-node cluster for that price with 100,000 reads/second, it seems to be reasonably price competitive.

Even if you were going to run instances yourself, a 16xlarge costs $4.256/hour and 4 spanner nodes seem like they'd be competitive with PostgreSQL (or even Aurora) on that hardware.

When you're building your business, you make trade-offs. Spanner means high-uptime and throughput without worrying about ops and scaling. At my company, we run our own database clusters, but we also have teams of people working on them (Spanner and RDS didn't exist when the company started). Do you want to have to worry about how you'll partition data and make sure everything stays consistent? Do you want to have to write more complex code to accommodate for less consistency? That will mean you're slower to ship features and those features will be buggier. Do you want to hit scaling limits where there isn't a larger EC2 instance to put your database on? Do you want to have to deal with growing those instances?

CockroachDB is meant to handle many of these things, but it's young and you're still going to need to deal with running it. Databases (at scale) can provide all sorts of subtle trouble. For example, in one of our clusters, we lost a node and the cluster decided to rebalance things. But it didn't have the IO capacity to rebalance and started grinding to a halt. This was with a very widely used NoSQL/NewSQL store. Do you like tracing and heap dumps?

The issue with running your own cluster isn't when things are going well, but when things start going sideways.


I'm much more likely to use Spanner than I am to operate my own installation of something like CockroachDB. To operate a database:

1. I need to provision N nodes across regions and availability zones

2. I need to patch those nodes

3. I need to back those nodes up

4. I need to build automated systems to rotate those nodes if they become unhealthy

5. I need to build automated systems to restore my entire database from backups in event of disaster

With Spanner, I don't have to do any of that, really. Maybe a bit of fiddling with replicas or something.


For many businesses, pricing is much less important than business continuity, in the sense that they are willing to pay more in the interests of getting "the name brand", so to speak. Why would they risk another RethinkDB by choosing a startup when they can go with Google's dogfooded version?

(This assumes they're already interested in outsourcing their databases in the first place.)


My guess is that the operational complexity of running a globally-distributed, fault-tolerant, performant SQL database surpasses the Spanner monthly bill (currently $0.90/h, or ~$675/mo, for a single node; $3/h or $2250/mo for a multi-region instance). Definitely not a side project kind of money, but think of the engineer-hours you'd put into maintaining such a setup in production -- plus the hardware costs.


Also note that a minimum of 3 nodes is recommended for production environments. So it's $2000/month at least to get started.


That's still a fraction of the cost of a competent database person, so it's still a deal. The downside is its limited feature set. There's no fixed-precision decimal type, and it's a pretty old SQL implementation.


Spanner is Google's private in-house database so it's not competing with NewSQL vendors.

Edit: Late to the news that Google is selling its spanner warez.

Well it's been in the game longer than anyone else and is battle tested by Google.


> so it's not competing with NewSQL vendors.

But it... literally is? https://cloud.google.com/spanner/


Spanner is not private.


>> How does Spanner stay competitive with other NewSQL databases?

Good question, but... what other NewSQL databases are we talking about here? That is, NewSQL databases that offer strong consistency and multi-region replication. I do not see many of them.


Every database has API lock-in.


I would say that Spanner is "more" lock-in than, say, an open-source database that's mostly wire-compatible with PostgreSQL.

But even then, the other big thing you have to compete on is pricing, right?


Like any other SaaS product, you're paying for management, which is valuable.


Great to see a comeback of the old values some of us were taught before the NoSQL trends gained strength.

One can only hope Google contributes some of their strong consistency mechanisms to open source so that more developers can be swayed towards strong consistency-based architectures. Spanner is great but creates huge vendor lock-in and some businesses will not repeat old mistakes.


Well sure. Consistency is great and for years most DBs people really used gave you all of the ACID guarantees. Then, huge scale came along. At extreme scale, sometimes you have to trade things for speed. But that extreme scale is a problem that frankly most people don't have.

So don't trade off a good attribute unless it's absolutely necessary. Seems sensible


It's really hard to convince people they don't have (and aren't likely to have) a Facebook-sized scale problem though, even if the vast majority of people really don't. It's like trying to convince people they don't have "big data". :-)


This bridges over to another thread (https://news.ycombinator.com/item?id=16156972) about the problem with many IT people thinking of their job as playing with cool technology rather than business outcomes. I’ve run into even a fair number of high-ranked people who habitually over-engineer things because they’ve tied their ego to the wrong metric, and many places didn’t have corrective pressure against that.

Hosted services have been good for counter-pressure: if your oracle cluster can be replaced with a $50/mo RDS instance, it’s a lot harder to hand-wave around the cost differential.


They're "web scale" people.


Well, as a question fairly on topic:

Has anyone successfully used CockroachDB in production?

As it seems to be the only viable alternative to spanner at the moment.


We use it as an insert only db for now for incoming webhooks which allows us to de-duplicate events and significantly reduce writes to the main PG db. We're running it on GKE which has been much easier to setup than PG.

There is also TiDB which is similar in some ways to CRDB: https://pingcap.com/en/


Have you had any major issues with it? If yes, how did you deal with them?


No major issues. If you're moving from a single node PG to CRDB you should expect CRDB to be slower because of all the consensus stuff. However almost every update I see from them contains some improvements in that area so it will get better with time probably. You just need to figure out if you're willing to accept a 2ms - 1m (PG) query execution vs 20ms flat execution (CRDB) - even after all the optimizations.


[cockroach here] You can check out stories of cockroachdb production deployments here -- https://www.cockroachlabs.com/customers/


I tried loading this on mobile safari and it looks ridiculous. The last cell of the table is very narrow and the text piles up vertically. For a company like Google the mess they make of how their pages display on mobiles is astonishing.


Global ordering is key, as briefly mentioned. To me, the proven distributed hash table systems like Bigtable, Spanner, HBase, RocksDB have it, and those that don't have it, aren't as strongly consistent (Riak, Cassandra, Mongo, etc..)

Having global ordering lets you reason about where your data needs to go, where it needs to be, and overlay different access patterns to see what will work. Heavy write load? Have more shards. Heavy read load? Replicate each shard more. Random read/write? Nothing will get you as close to uniform load distribution as global ordering.
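
As a toy illustration of what range sharding over a global order looks like (the split points are made up):

  # Toy sketch: route keys to range shards using the global key order.
  # Contiguous ranges mean you can add split points under hot write ranges
  # and add replicas for hot read ranges, independently of each other.
  import bisect

  split_points = ["g", "n", "t"]   # 4 shards: (-inf,"g"), ["g","n"), ["n","t"), ["t",+inf)

  def shard_for(key):
      return bisect.bisect_right(split_points, key)

  assert shard_for("apple") == 0
  assert shard_for("melon") == 1
  assert shard_for("zebra") == 3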


I am always a bit surprised that HBase/Phoenix aren't more widely celebrated in the scale out database scene vs Cassandra, MongoDB, etc but a lot of people designing new systems as users of these DB systems have NFC what they are doing and I guess are swayed by marketing stickers and tshirts? G's BigTable/Spanner if you can afford to outsource..


To be fair: these are three very different kinds of store, designed to meet the needs of very different workloads, and trying to solve very different problem sets.


Not the way I see them used. HBase is quite versatile; Facebook's HydraBase is a good example of doing things people think they need Cassandra for: https://code.facebook.com/posts/321111638043166/hydrabase-th.... It's also telling that they, as its creator, abandoned Cassandra, but C* has some very niche conceivable uses (eager replication cache for metadata?). Mongo is just stupid.


I used to agree until I was forced to use an eventually consistent system.

Both have their place, but I think you get better architected, more reliable software with an eventually consistent system.

In an eventually consistent system, the programmer must consider what to do in the case of data failure. I think this is a good thing. When building a user authentication system, who better to decide what to do when you can't access the user database than the person writing the user auth system? Should it fail open or closed? Well if you're a bank, you should probably fail closed. If you're Netflix? Fail open. The programmer should know the business case better than the database, which will always fail closed.

Sure, in a consistent system the programmer could think about that, but they aren't forced to -- and being forced to think about it is, I think, a good thing.

Also, it means that joins must be done in your application. Sometimes it's good to have the programmer deal with that too, because it means they will have to really consider what data they need and be much more familiar with how much data is needed to create the join. If you're just relying on the database to do the joins, it might be hiding a lot of load from you, especially if you inadvertently create multiple table scans. In an eventually consistent system, you know if you created a table scan, because you have to pull back a full table's worth of data (or not, because you saw that issue and fixed it).
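
For what I mean by joins in the application, here's a toy sketch (dicts standing in for two key-value "tables"):

  # Toy sketch: a hash join done in the application rather than the database.
  # Every lookup here would be a read against the store in a real system,
  # so it's obvious exactly how much data the join pulls back.
  users = {1: {"name": "alice"}, 2: {"name": "bob"}}
  orders = [{"user_id": 1, "total": 30}, {"user_id": 1, "total": 5}, {"user_id": 2, "total": 12}]

  def orders_with_names():
      return [{"name": users[o["user_id"]]["name"], "total": o["total"]} for o in orders]

  print(orders_with_names())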

It just feels like the programmer gets better insight into what is happening to the data and can make better decisions on what to do when they are closer to the business cases the software is being built for.


I think you may have some misconceptions about what consistency means in this context.

> In an eventually consistent system, the programmer must consider what to do in the case of data failure.

Eventual consistency has nothing to do with data failure; the real-world effect of eventual consistency is stale data, which you can't generally tell is stale.

> When building a user authentication system, who better to decide what to do when you can't access the user database than the person writing the user auth system?

I think you're talking about availability, not (eventual) consistency.

> Also, it means that joins must be done in your application.

There is nothing preventing you from doing joins in a strongly consistent database. Google Spanner supports JOINs for example [1].

> In an eventually consistent system, you know if you created a table scan, because you have to pull back a full table's worth of data

This also has nothing to do with eventual consistency.

[1] https://cloud.google.com/spanner/docs/query-syntax#join-hint...


Agree with above, I'm not sure why that comment is most upvoted, it's poorly written and the author doesn't seem to have a good grasp of distributed systems concepts.


Vague "I agree" criticisms make for poor reading. Provide specific counter examples like the parent or don't bother commenting.


I was using "eventually consistent system" as a shorthand for "multi-master database that is eventually consistent" which may have been confusing.


I was using "eventually consistent" as a shorthand for "multi-master highly-available key-value distributed database". Most people who don't work in the space regularly consider them the same thing. Basically Cassandra/DynamoDB vs Cloud Spanner.

> Eventual consistency has nothing to do with data failure; the real-world effect of eventual consistency is stale data, which you can't generally tell is stale.

You're right, that is the highly-available distributed part, but it goes along with eventual consistency: if you plan for dealing with stale data, then those same concepts protect you from unavailable data.

> I think you're talking about availability, not (eventual) consistency.

Again, the two go hand in hand. Deal with stale data and you've dealt with unavailable data.

> There is nothing preventing you from doing joins in a strongly consistent database.

You misread -- I was saying that with a multi-master key-value system you have to do joins in the app.

> This also has nothing to do with eventual consistency.

Same, I was talking about Dynamo-like systems.


Hey, appreciate the attempt at clearing things up, but I’m afraid your original comment is still quite wrong.

It looks like you’re conflating a well-defined term in distributed systems with properties of particular database implementations that have nothing to do with eventual consistency.

Nothing in your reply is a necessary property of eventually consistent systems, although some eventually consistent systems may be built with those properties. That is quite a huge difference when we’re in the context of the blog post being discussed.

The grayness of your reply should suggest that perhaps you’re misusing the term, rather than everyone else being “confused”.


I agree with the principle that good development practices will try to get the coder to think through the potential failure modes of the code they've written.

Unfortunately, I think you've come to the opposite conclusion in this case. When jedberg writes the code for a system backed by an eventually-consistent data store, he asks "What happens if this value is stale and how can I structure the code to cope with that in a sane and reasonable way?"

But when $Random_Dev writes the code for such a system, he asks some combination of "What does eventually consistent mean?", "What do you mean I won't get back the right data?", and "Isn't that the DBA's problem?"

The core issue being that the average dev will punctuate this line of questioning with things that shift the blame elsewhere; if not the DBA, then "Well, it's good enough for [INSERT_TRENDY_COMPANY_HERE] [ reddit? ;) ], so I think we'll be OK!" or "I don't think that will ever happen to us, our reads get consistent quickly enough". Just whatever will punt the issue down the road.

To be honest, I believe that this impulse is the impetus behind the rise in NoSQL in the first place. Devs resent DBAs and SQL. They, very simply, want a place to shove their data, a place to read it out, and to not know anything else about the in between. I have seen many pitiful architectures that clearly turned on this consideration (and we're seeing it all over again with cloud orchestration).

Good developers will dig in and search out the meaning of the data whether the failure case is handled at the DB level or not.

Highly-structured systems like a traditional RDBMS are very good for resiliency because they require someone to put some definition and formality around the dataset to get basic functionality, and they are generally extremely careful about ensuring the things that come in and out of them are safe and accurate. That's the kind of backing that we really need, especially when there are many developers who can't be trusted to even try to understand more than the bare minimum necessary to get a positive quarterly review.


> Highly-structured systems like a traditional RDBMS are very good for resiliency because they require someone to put some definition and formality around the dataset to get basic functionality, and they are generally extremely careful about ensuring the things that come in and out of them are safe and accurate.

That's not what happens in practice. Just because an RDBMS has transactions doesn't mean a programmer who isn't trying is going to use them correctly. And when you do enforce constraints at the DB level, you end up duplicating all your logic in two different languages, one of which isn't very nice for programming in. E.g. I've never found database timezone handling to be anything other than harmful. The application is just better at enforcing this kind of constraint; a well-designed application could make use of transactionality, but putting the transactionality in the datastore won't make the application better at it if the dev wasn't thinking about it - and if the dev is thinking about it, they can usually do better at the application level.


>Just because an RDBMS has transactions doesn't mean a programmer who isn't trying is going to use them correctly.

Definitely true, but RDBMS provide a lot of safety features beyond transactions, like a fixed schema. Mongo et al let developers stuff anything they want anywhere, creating an abominable bastardization between a key-value store and a database. That's their "killer feature" that has resulted in wide adoption so they can't really do away with it, even if some [likely reluctant] Mongo users continue to find ways to self-enforce a schema.

It'd be one thing if Mongo was predominantly used as a staging ground for free-form, unpredictable data, but I think we all know that's not the case. People pretend like it's a replacement for a traditional RDBMS.

>a well-designed application could make use of transactionality, but putting the transactionality in the datastore won't make the application better at it if the dev wasn't thinking about it - and if the dev is thinking about it, they can usually do better at the application level.

I really disagree here. The semantics and detail involved in _actually_ making sure a datum is securely stored, including a high likelihood of being recoverable in the case of a crash, is very complex in any non-trivial system. It's dismissive and naive to pretend that your average application developer, who generally doesn't even know what `fsync()` is, can do a better job at this than a real database, which will include advanced consistency features like intent logs.


> Mongo et al let developers stuff anything they want anywhere, creating an abominable bastardization between a key-value store and a database. That's their "killer feature" that has resulted in wide adoption so they can't really do away with it, even if some [likely reluctant] Mongo users continue to find ways to self-enforce a schema.

Their real killer feature was proper multi-master clustering - being able to seamlessly add and remove nodes from your storage cluster - in a free product. Traditional RDBMS don't do that.

> The semantics and detail involved in _actually_ making sure a datum is securely stored, including a high likelihood of being recoverable in the case of a crash, is very complex in any non-trivial system. It's dismissive and naive to pretend that your average application developer, who generally doesn't even know what `fsync()` is, can do a better job at this than a real database, which will include advanced consistency features like intent logs.

Intent logs are necessary precisely because the database table model is a poor fit for what actually happens - even databases themselves now spend more effort on converting back and forth to the SQL model than on actual storage or query implementation. A developer with the skills to do it right does need a library of certain capabilities, but the RDBMS is not a good interface for supplying them with those capabilities. See e.g. https://www.confluent.io/blog/turning-the-database-inside-ou... .


Mongo with a strongly typed language also encourages someone to put some definition around the dataset. That someone is just the developer creating the PO*O (Plain Old [C#/Java, etc] object).

In C#, for instance, using the standard Mongo driver, you're dealing with strongly typed collections, and the compiler keeps you from mistakenly going outside of the "schema".

I don't know how good the other .NET drivers are for other NoSQL databases.
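For comparison, the official Node.js `mongodb` driver gives you something similar through generic collections; a rough TypeScript sketch (the UserDoc shape and connection string are made up for illustration):

    import { MongoClient } from "mongodb";

    // The "schema" lives in the application's type system, not in Mongo itself.
    interface UserDoc {
        email: string;
        displayName: string;
        createdAt: Date;
    }

    async function main() {
        const client = await MongoClient.connect("mongodb://localhost:27017"); // illustrative URI
        const users = client.db("app").collection<UserDoc>("users");

        // Inserting an object literal that strays from UserDoc (extra or
        // misspelled fields) is a compile-time error rather than silent schema drift.
        await users.insertOne({ email: "a@example.com", displayName: "A", createdAt: new Date() });

        const found = await users.findOne({ email: "a@example.com" });
        console.log(found?.displayName);

        await client.close();
    }

    main().catch(console.error);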


Your point is valid, but it makes me sad.

We basically come from opposite viewpoints -- I think most devs will do the right thing, you think most will do the wrong thing.

Honestly, you're probably right and I'm probably wrong, but I like to hope that I will always work in a place where I'm right. :)

(It was Netflix BTW, not reddit. For the most part reddit used strongly consistent databases (Postgres), but we did use some eventual consistency in the caching layer).


I have a slightly different perspective. I think defaulting to NoSQL is simply removing a mature toolset from the developer's kit, one that they would have a lot of great use cases for.

If you use it in the wrong places, that almost certainly means increased complexity where you don't necessarily need it. Whether or not that results in other positive behaviours seems a bit beside the point to me.

If you managed a team of rock climbers, you could say "I prefer it when my rock climbers don't use safety ropes. It improves their climbing, because if they make a mistake, they'll fall to their death". It could be 100% true, but it doesn't mean you shouldn't simply be picking the right tools for the job.


Yeah, the dichotomy in this conversation is strange.


> I think most devs will do the right thing, you think most will do the wrong thing.

This comes down to culture. I wish I knew how to create a culture where devs do the right thing, from a culture where they ¯\_(ツ)_/¯.

Maybe you need the very-clearly-Netflix model of "Fire every ¯\_(ツ)_/¯".


Most people don't get consistency right...

Outside of AWS, Google, and Azure, I see few cloud services that define their consistency models. And support always says: we are highly available and always consistent.

Same goes for distributed software: few projects define their consistency model, and even fewer honor it.


I agree with this 100%; the problem is that the eventual consistency model opens up the failure window. It's the case of the private toll agency here in TX that had a full-blown court case/collections agency trying to collect from me (thousands for literally $2.75 in tolls), while sending information to my five-year-out-of-date address.

That's despite the fact that I changed my address when I moved (it says so in the TxDPS record, running on a "legacy" system), which got sent to the toll agency when I changed it. Of course, somewhere in there the toll agency lost this information and never bothered to check the TxDPS record until I pointed out they were repeatedly sending notices to the wrong address.

The point being that this is only one of many problems they have had with their newfangled toll system, because sometimes "eventually consistent" actually means "we threw the update away because a newer one came in, so we lost a critical piece of information." Critical pieces of information like people actually paying their tolls, for example.

http://www.texasturf.org/2012-06-01-03-09-30/latest-news/484... http://story.kxan.com/txtag-troubles/ https://www.reddit.com/r/Austin/comments/78wp9y/txtag_paymen...


I'm not sure I follow your example. Is anyone actually "failing open" for user authentication? Why?

I think I can understand some of your point. Trust should be engineered into the system, not assumed. That is a lot of work and not easy, though. (I think this ultimately lands on the blockchain view of the world. I'm fairly ignorant there.)

Strong consistency just means that you can expect any accesses you performed to be consistent in time with each other, right? To that end, it is nice not having to effectively do a "loopUntilChangeReflected" after every write and check to see if it is safe to do the next thing. Which you have to do in an eventually consistent world.
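Roughly what that pattern ends up looking like (a sketch; readValue and the retry numbers are invented):

    // The "loopUntilChangeReflected" dance: poll an eventually consistent
    // read until the write you just made becomes visible.
    async function waitUntilVisible(
        readValue: () => Promise<string | undefined>, // stand-in for the store's read API
        expected: string,
        attempts = 20,
        delayMs = 50,
    ): Promise<void> {
        for (let i = 0; i < attempts; i++) {
            if ((await readValue()) === expected) return; // the replica has caught up
            await new Promise((resolve) => setTimeout(resolve, delayMs)); // back off and retry
        }
        throw new Error("change still not visible; giving up");
    }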


> To that end, it is nice not having to effectively do a "loopUntilChangeReflected" after every write and check to see if it is safe to do the next thing. Which you have to do in an eventually consistent world.

If you have to do this, you probably shouldn't be using an eventually consistent system. Eventually consistent systems should only be deployed when a stale read is harmless. For example, it probably isn't a big deal if something reads "5 upvotes" instead of "7 upvotes" before a change fully propagates out.

Wrapping every write in a "block to ensure consistency" loop is just a lossy, slow, terrible and dangerous way to recapture a tiny amount of the consistency available to those who use a system that suits their needs.

I know that in the real world, we don't often get to make these decisions, or there may be one-off edge cases (though really just put these on a system that's safe for them). But I just want to point out to other readers that if you catch yourself doing this, you just need to use a datastore that fits your application's use case.

Yes, even if you just read an article in TechCrunch about the Hippest Coolest Hottest Big-Dataiest 2.0 3.0 4.0 5.0 Data Storage Project That Makes Google and Facebooks' Mouths Water Hadoop-Daboop Machine, That Only Cool People Use, and No Uncool People Or Trump Supporters Of Any Kind Allowed.

Read that article, and then go use Postgres.


Fair. I was overly silly in my example there.

I'd still argue that even reasoning about all writes in an "eventually this is true for everyone else" fashion is less than straightforward.

To that end, components should all be able to assume what they do is strongly consistent. If possible, assume everyone else is behind?


Not that I necessarily agree, but what GP is saying is: say you have a distributed DB backing your auth system, and there is a partition. The user tries to log in and you can't be sure that their account is still valid because of the partition, but because all they are trying to do is watch a movie, just let them in based on the data in the partition you do have access to.

This is in contrast to a bank, which should only operate when there is no partition.


That makes sense, but how do you encode it? The auth system now does not return authorized or not authorized, but authorized or maybe authorized? Seems much easier to just rely on auth being a yes or no. Possibly with an expiration for how long a yes can be trusted.

That is, how is it simpler to assume that the data you get back could be wrong? Now, anyone involved with the system seems to have to know how everyone involved in the system interprets results.


> Now, anyone involved with the system seems to have to know how everyone involved in the system interprets results.

This meshes with the article's assertion: strong consistency leads to fewer bugs because fewer possible states exist at the application level.


It allows the person writing the business software layer to make a decision that fits that business use case independent of how others interpret it. But no one else needs to know how it works.

If the "play movie" service calls the auth service and says "should I let this play", the auth service returns "authorized" or didn't give a response or said "maybe authorized" or whatever. The "play movie" service makes the business decision that makes sense for playing a movie, in this case, "go ahead".

But the billing service maybe says "I'm only gonna let you in with a confirmed yes". So it's possible the billing service will give you an error in the same conditions that the movie service does not, but those are business decisions instead of technical ones, which is good.
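Concretely, something along these lines (a sketch only; the AuthResult values and function names are invented to show the split):

    // The auth service can report a degraded answer instead of failing hard.
    type AuthResult = "authorized" | "denied" | "unknown"; // "unknown" = partition / no response

    // Each consumer turns that answer into its own business decision.
    function canPlayMovie(auth: AuthResult): boolean {
        return auth !== "denied"; // fail open: worst case is a free movie
    }

    function canViewBilling(auth: AuthResult): boolean {
        return auth === "authorized"; // fail closed: only a confirmed yes
    }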


This doesn't sound like a consistency debate, though. Distributed system, yes. Weak consistency, though?

I'd want strong semantics on, "what is allowed with an authenticated connection with an authentication that is X old?". Preferably, I'd want a somewhat knowable centralized list of the answers. These are not software concerns, but business ones. Right?


> Is anyone actually "failing open" for user authentication?

Netflix did for a while at least, and may still. Worst case is that someone who didn't pay for the month gets to watch. The partitions happen rarely enough that it is better to give the user a good experience than to not let a legit user in because of a network partition.

But the nice part is that this was a business decision, not a database decision.


Billing is not authentication, it's a facet of authorization. In any event, it would be pretty lousy if Netflix exposed account details, viewing history, etc., to loosely-authenticated accounts.


It was only for playing a movie. But that right there is a perfect example of why the logic should be at the business level and not the database level.

Allow them to watch a movie, but don't show billing details.


This is still worded poorly. You aren't skipping authentication for these people. You just aren't doing it again.

In theater parlance, people in the theater are assumed authorized to be there by virtue of the entrance requiring authorization.

Right?

Similarly, this had little to do with consistency. Any client that makes a logout should assume the connections it has are no longer valid. To assume otherwise seems odd. Worse, to encourage relying on it.


This sounds suspicious. They had other mechanisms to guard against abuse besides auth?

Worst case would be a renegade client that constantly used the open door, no?


I think this is a backend thing, so a renegade client isn't an issue. The choice is: if the auth system was down, should the video system prevent everyone from seeing video, or let everyone in? If auth is down, some people will be able to watch when they shouldn't be able to, but only for the period of time while the auth system is down.


Wouldn't this only allow recently authenticated people in? That is, connections at this point can check if there is an "authenticated" flag set. Assuming any basic expiration, the only people getting through this are those that have recently lost authorization. They were still authenticated. Right?


If auth is up, everything works as normal. If it's down let everyone in.

I'm envisioning something on the backend equivalent to:

OnGetVideoChunkRequest: if Authorized(User) or IsDown(AuthSystem) then SendVideoChunk


Makes sense. Still surprises me that it wouldn't be a check more like "if not expiredOrAbsentAuthentication..."


I think you're overly focusing on details. Authorized(User) might check expiration or other things, but that isn't really relevant to the fail open vs fail close idea being discussed.


I'm focused on connections. All systems should not have to connect back to the auth system to do their job. To that end, determining that a connection is trusted is not a weak versus strong consistency issue.

Now, checking a password in the system should be strongly consistent. In that I should be able to update my password and immediately use the updated password. Any other behavior will surprise me.


I wouldn't take my example too literally. The example was just to illustrate the basic logic for fail open and intentionally wasn't complete or detailed.

For purposes of discussing fail open, how that logic gets translated during implementation doesn't matter. Authorized(User) might be expiring tokens passed in via cookie. IsDown(AuthSystem) could just check some global flag that gets set by some watchdog service on an AuthSystem heartbeat. It could be something else. It doesn't matter because it's not relevant.

Having a direct connection to the AuthSystem or not is an implementation detail, not something that changes the idea "Netflix should fail open".


I suspect it is just focusing on an example. So, to that, apologies if I am over diluting the conversation. Rereading my first post, I should have been clearer that I can agree with the point. Just don't know if auth is the domain I would pick, since it isn't a persisted thing, but a vouched one.

Similarly, anywhere you are doing idempotent guarantees, strong consistency is vital. Same for customer interaction. Reads following writes that a user made should almost certainly strive for strong consistency. If only to avoid confused customers.

Writes the system makes, though? Expect those to just fail in ways that will leave things inconsistent and constantly check for them.


It would only fail open if the connection from the auth service to the database was down. It's all server side.


See my sibling response. This doesn't sound like a consistency debate as much as a trusted-connections debate.


I don't think you can just set a line like that. Eventually the programmer might want to consider what to do in the case of data failure. Otherwise, why even write shared code? The programmer might be wise to write all their software from scratch, affording every consideration where they know best, but they certainly won't want to start there. And this says nothing of teams where people leave and join over time and institutional experience is ephemeral.


This is a fair point. Maybe the engineer doesn't think about it, but at least some engineer wrote a library that takes care of it, instead of relying on the database to do it. And even with a consistent system, maybe they are still using a library.

But my point was that eventually consistent systems, in general, force you to think about these things earlier on. Of course there are exceptions in both directions.


Wow, I so strongly disagree with this. I'm currently working on a system backed by DocumentDB, and had a solution for the core of our application written out in a 13-line JavaScript stored procedure. The edict came down from above: no stored procedures, because scalability. No analysis, no real-world data (we're currently doing a few transactions per day; my understanding is that Stack Overflow uses SQL Server and scales just fine), just idealism. So we set off to push our stored proc logic into a lazily-consistent background job. I hated the idea initially, but the result turned me off of eventual consistency permanently.

This is because, not only did we develop an edict against stored procedures, but we then developed an edict that everything must be done using the lowest consistency model possible. We ended up settling on session consistency. TBH we discovered an algorithm that didn't even require session consistency, but the latency would have been too great. Again, no particular analysis or rationale for the edict, just idealism.

The end result was a system that, for each update, required six background jobs to be launched, each of which launched their own cascading jobs that patched up the data. And our records all had to be peppered with version fields and updated-by-job fields and version-of-foreign-key fields that these fixup jobs had to test against to make sure they're not operating against stale data or running over each other, such that this crap actually outweighs the actual data in the records. And these background jobs manage state in Table Storage and transfer messages via Service Bus, so that's two more SPOFs, not to mention all the extra compute resource cost.

It's a huge mess of a Jenga puzzle that's taken months to implement and code review and test, and even now, six months later, we're still finding edge cases that aren't handled correctly (or are they? you have to constantly go through from the start to really understand it). In addition, since things are never guaranteed to be in a consistent state, we could have foreign keys that don't reference anything, or circular graphs where cycles should be disallowed, or whatever else at any moment in time, so we have to fudge this stuff when presenting data to the user, and it reduces our ability to do any sort of meaningful statistics on our data. All of this to remove 13 lines of JavaScript. It's just a mess.

And then what happened is a couple new requirements came in that couldn't be reasonably handled in a lazy fashion, so we ended up using stored procs for that logic anyway.

Now, to be clear, the resulting system is strictly more scalable than the stored proc based one. One could in theory just throw more machines at any load and it'll continue to work. So there's that. But a) the load up to which the stored proc system could scale is about 3 orders of magnitude higher than even our optimistic usage estimates for the foreseeable future, and b) the lazy algorithm requires so much additional compute overhead that, while it's infinitely scalable, we might not even be able to afford it anymore.

So for my money, give me a strongly consistent system by default. I don't even particularly understand the argument given above. With a strongly consistent system, you can still choose to do things like join in the client if there's a reason for it. But going the opposite way and forcing consistency to be unassumed everywhere produces the mess we're dwelling in now.

Now, certainly there are cases where eventual consistency is better: high-write-volume on not-very-relational data where point-in-time accuracy isn't super business-critical is where it typically shines. At the "edges" of your system, essentially. But I do want to point out some of the pain that can be encountered if it's chosen for the wrong use case, and make sure that anyone who does use it in their core application logic knows what they're in for, and puts some more thought into whether it's what they really want.


> The end result was a system that, for each update, required six background jobs to be launched, which each launched their own cascading jobs that patched up the data. And our records all had to be peppered with version fields and updated-by-job fields and version-of-foreign-key fields that these fixup jobs had to test against to make sure they're not operating against stale data or running over each other, such that this crap actually outweighs the actual data in the records.

That sounds bad. The way you're supposed to do it is to keep the initial data immutable and compute the downstream data separately, into its own place.
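A tiny sketch of that shape (names invented): the raw updates are append-only, and the cleaned-up view is computed into its own place rather than patching records where they sit.

    // Raw updates are immutable, append-only facts; fixup logic never rewrites them.
    interface UpdateEvent { entityId: string; field: string; value: string; at: number }

    const events: UpdateEvent[] = []; // stand-in for an append-only log or table

    function recordUpdate(e: UpdateEvent): void {
        events.push(e);
    }

    // The derived view lives separately and can be rebuilt from scratch at any
    // time, so there is nothing in the source data to "patch up" later.
    function buildCurrentView(): Map<string, Record<string, string>> {
        const view = new Map<string, Record<string, string>>();
        for (const e of [...events].sort((a, b) => a.at - b.at)) {
            const row = view.get(e.entityId) ?? {};
            row[e.field] = e.value;
            view.set(e.entityId, row);
        }
        return view;
    }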

> So for my money, give me a strongly consistent system by default. I don't even particularly understand the argument given above. With a strongly consistent system, you can still choose to do things like join in the client if there's a reason for it. But going the opposite way and forcing consistency to be unassumed everywhere produces the mess we're dwelling in now.

I'd say transactions should never be the default, because a database transaction that isn't a semantic transaction is worse than useless. If the application programmer hasn't thought about transactionality and synchronization, it's much better for that to show up as not having any transactions - you can fix that, by adding transactions where you need them - than to have transactions that don't mean anything, and the data ending up inconsistent even though no transactions ever overlapped.


It sounds like your root problem was dogmatic thinking, not any sort of consistency model.

There is a time and place for eventual consistency and multi-master, and there is a time for strongly consistent SQL servers.


> "Cloud Spanner, in particular, provides external consistency, which provides all the benefits of strong consistency plus serializability."

What Google call 'external consistency' is what's known by relational theory people as 'isolated serializability'.


So the gist is that they can get better transaction throughput because they serialize transactions with a distributed clock rather than by locking? Isn't there also usually a query latency trade-off with higher consistency?


This post would have benefitted from either explaining what Cloud Spanner is or linking to such an explanation in the first paragraph.



That depends on their target audience actually. It’s not objective like you phrased it.


This article is a great resource (I'll be returning to it again and again), but no no no, its opinion is fundamentally flawed.

Ignoring the fact that they obviously have a strongly consistent product to sell, which therefore informs their view, the flaw is that the article is the cliché "I have a hammer, therefore everything is a nail".

Honestly, if you go back to the beginning of the article, their first quote sums it up: "better to have programmers deal with performance problems ... as bottlenecks arise" admits that this is the inevitable outcome of strongly consistent systems!

If physics is against it, you're asking for trouble. I've given tech talks around the world on this, just don't pick a fight against physics or else YOU WILL LOSE FAITH in any/all database systems you use: http://gun.js.org/distributed/matters.html .

That is what these strongly consistent database systems run on: faith (religion) that the database is their god that will solve all their problems. You only need to take a look at http://jepsen.io/ to see that basically every single system out there (except for like RethinkDB and one or two others) fails miserably at being strongly consistent even when it is supposed to be.

But hey, they need an excuse to get people to buy into Cloud Spanner - and nothing sounds better than "Hey we're Google, let us manage your data for you, because you aren't smart enough to deal with consistency."


I don't believe it's religion - it's experience. Google, like Amazon, initially started building scale-out systems by sacrificing strong consistency and transactions. Amazon did it with Dynamo, and Google did it with BigTable. Over time - and with very substantial engineering investment - Google has started walking that back, first with Megastore, and now with Spanner.

What we're seeing happen is a reinvention of a lot of classical transactional systems from a "massive scale-first" perspective instead of a "local transactions on spinning disk fast" perspective. The eventually- and causally-consistent systems have something to add, but I don't think it's wise to discount Google's years of engineering experience in this as religion. Rather, it reflects starting to operate at a scale, in terms of number of engineers, at which it has become worthwhile to invest dedicated engineering and research effort into supporting a transactional model, at scale, so that thousands of other engineers can be more productive.

There's a chance this experience doesn't generalize -- recall that Google has massive amounts of dedicated inter-datacenter bandwidth with good management and redundancy, the ability to hire dedicated teams of database & distributed systems experts, etc. But it probably generalizes to the other giant companies in the field -- Microsoft, Amazon, Facebook, and Baidu have similarly huge numbers of "non-distributed-systems-expert developers" who need to get stuff done, correctly, and the ability to invest heavily in the infrastructure to make them more productive.

Another way to summarize it is: At some point, you're going to have to fix problems at the application and algorithm level, and not just hope that the underlying storage system makes everything magic. It's easier when those problems are performance problems than when they're correctness/consistency problems.

(Disclaimer: This is my view as a distributed systems professor, not a one-day-per-week Google employee, but I do have that second hat around.)


> You only need to take a look at http://jepsen.io/ to see that basically every single system out there (except for like RethinkDB and one or two others) fails miserably at being strongly consistent even when it is supposed to be.

What a terrible assertion. Most of the systems that Jepsen tests are AP systems. Basically all of the systems that are supposed to be CP (e.g. etcd [1], Zookeeper [2], CockroachDB [3]) have been shown to be more-or-less fine at what they are supposed to do.

[1] https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

[2] https://aphyr.com/posts/291-call-me-maybe-zookeeper

[3] http://jepsen.io/analyses/cockroachdb-beta-20160829



