It's definitely true that putting the burden of consistency on developers (inste...

davidgay · on Sept 21, 2018

It's worth noting that Cloud Datastore's follow-on, Cloud Firestore, does provide strong consistency, and includes a "Datastore mode" that supports the Datastore API.

Firestore is currently in beta, but once it's GA we will be migrating all Datastore users to Firestore: https://cloud.google.com/datastore/docs/upgrade-to-firestore

Disclaimer: I work on Cloud Datastore/Firestore.

stingraycharles · on Sept 21, 2018

I always got the impression that Firestore is more aimed towards mobile apps rather than backend applications, and as such more or less a different kind of product. Is this not the case?

davidgay · on Sept 21, 2018

Datastore was initially designed for use from App Engine, i.e., an easy start, no management, automatic scaling environment ("serverless" to use the current in-vogue term).

I would view the Firestore API as a further extension (the "Datastore Mode" functionality was always an element of the design) of that paradigm, extending to the case where you have no trusted piece of code to mediate requests to the database, thus allowing direct use from, e.g., mobile apps (at which point other issues such as disconnected operation surface).

So not so much a "different kind of product" and more a product that supports a strict superset of use cases.

thesandlord · on Sept 21, 2018

While Firestore has an Android/iOS/Web SDK, it also has great backend support (Python, Java, Node, Go, Ruby, C#, PHP) as well. The "realtime" features of Firestore are better suited for mobile IMO, but using Firestore as a scalable, consistent, document/NoSQL database for your backend is definitely a good use for it.

I actually think most of the server SDKs don't even expose many of the realtime APIs. Maybe they will in the future, but it shows that you can use Firestore like a normal database just fine.

(I work for GCP)

wikibob · on Sept 21, 2018

I’m guessing that Nomulus is your project?

I ran across it before and just wanted to say it’s really cool that this is open sourced.

CydeWeys · on Sept 21, 2018

I'm the tech lead of it. Glad to hear that you've heard of it before, and yeah, I think it's pretty cool it's open sourced too, which is why I made it happen! https://opensource.googleblog.com/2016/10/introducing-nomulu...

guillon · on Sept 23, 2018

Any idea if any company is planning to create an offering dedicated to .BRAND new gTLDs?

erikb · on Sept 21, 2018

Why does anybody need to provide consistency? Often you don't have complete consistency anyways. There are bugs, there are time delays, there is parallel processing. Why even have the requirement?

CydeWeys · on Sept 21, 2018

Let me expand on an example issue that comes up.

Nomulus is software that runs a domain name registry, including most notably the .app TLD. There are three fundamental objects at play here: the domains themselves, contacts (ownership information that goes into WHOIS), and hosts (nameserver information that feeds into DNS). There's a many-to-many relationship here, in that contacts and hosts can be reused over an arbitrarily large number of domains.

The problem is that you can't perform transactions over an arbitrarily large number of different objects in Cloud Datastore; you're limited to enlisting a maximum of 25 entity groups. This means that you can't perform operations that are strongly consistent when contacts or hosts are reused too often. This situation comes up a lot; registrars tend to reuse the same nameservers across many domains, as well as the contacts used for privacy/proxy services.
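To make the limit concrete, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that every host and every domain lives in its own entity group; the function names and sample data are hypothetical and are not taken from Nomulus.

```python
# Limit on entity groups per transaction, as stated in the comment above.
MAX_ENTITY_GROUPS_PER_TXN = 25


def entity_groups_touched(host, domains_by_host):
    """Entity groups one transaction must enlist to cover a host plus
    every domain that references it (assumes one group per entity)."""
    return 1 + len(domains_by_host.get(host, ()))


def fits_in_one_transaction(host, domains_by_host):
    """True if the host and all referencing domains fit under the limit."""
    return entity_groups_touched(host, domains_by_host) <= MAX_ENTITY_GROUPS_PER_TXN


# Hypothetical data: a rarely used nameserver and a registrar-wide one.
domains_by_host = {
    "ns1.small.example": [f"site{i}.app" for i in range(3)],
    "ns1.registrar.example": [f"customer{i}.app" for i in range(500)],
}

for host in domains_by_host:
    print(host, entity_groups_touched(host, domains_by_host),
          fits_in_one_transaction(host, domains_by_host))
```

Under these assumptions the small nameserver fits comfortably, while the registrar-wide one would need far more entity groups than a single transaction allows.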

These problems don't arise in a relational SQL database, because you can simply JOIN the relevant tables together (provided you have the correct indexes set up) and then perform your operations in a strongly consistent manner. That trades off scalability for consistency though, whereas in Spanner you give up neither.
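As a rough illustration of the relational approach, the sketch below uses Python's built-in `sqlite3` module. The schema, table names, and the `delete_host_if_unused` helper are assumptions made up for this example; they are not how Nomulus or any real registry stores its data.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical registry schema."""
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE hosts (name TEXT PRIMARY KEY);
        CREATE TABLE domains (name TEXT PRIMARY KEY);
        CREATE TABLE domain_hosts (
            domain TEXT NOT NULL REFERENCES domains(name),
            host   TEXT NOT NULL REFERENCES hosts(name),
            PRIMARY KEY (domain, host)
        );
        CREATE INDEX domain_hosts_by_host ON domain_hosts(host);
    """)
    return conn


def delete_host_if_unused(conn, host):
    """Delete the host only if no domain references it.

    The check and the delete run inside one transaction, so no link can
    appear between them. Returns True if the host was deleted.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        in_use = conn.execute(
            "SELECT COUNT(*) FROM hosts h "
            "JOIN domain_hosts dh ON dh.host = h.name "
            "WHERE h.name = ?",
            (host,),
        ).fetchone()[0]
        deleted = False
        if in_use == 0:
            cur = conn.execute("DELETE FROM hosts WHERE name = ?", (host,))
            deleted = cur.rowcount == 1
        conn.execute("COMMIT")
        return deleted
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = make_db()
conn.execute("INSERT INTO hosts VALUES ('ns1.example.net'), ('ns2.example.net')")
conn.execute("INSERT INTO domains VALUES ('example.app')")
conn.execute("INSERT INTO domain_hosts VALUES ('example.app', 'ns1.example.net')")
```

Here the number of domains referencing a host does not matter: the join is served by the index, and the transaction covers however many rows it touches.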

erikb · on Sept 21, 2018

I don't see how this relates to the question of whether consistency is needed. If two tables only have 80% of all the rows that you expect, you can STILL do a join on them. It's just that the join, just like your original data, does not contain all of the rows. The join itself will not raise an exception because of that.

CydeWeys · on Sept 21, 2018

Strong consistency is required because you cannot delete/rename a host or contact if it is in use at all. Hence the requirement for strong consistency. It's not good enough to say that you can go ahead with the operation because it's not used by at least 80% of domains; you need to know that it's not in use by any domains.
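A toy model of the failure mode, assuming a hypothetical store with an authoritative map and a lagging index (the `Registry` class and its methods are invented for this sketch and do not mirror any real Datastore API):

```python
class Registry:
    """Toy store: authoritative links plus an index that lags behind."""

    def __init__(self):
        self.domain_hosts = {}  # authoritative: domain -> set of hosts
        self.stale_index = {}   # lagging copy: host -> set of domains

    def link(self, domain, host):
        """Record that a domain uses a host; the index is NOT updated yet."""
        self.domain_hosts.setdefault(domain, set()).add(host)

    def sync_index(self):
        """Rebuild the index from the authoritative data (the 'eventual' part)."""
        self.stale_index = {}
        for domain, hosts in self.domain_hosts.items():
            for host in hosts:
                self.stale_index.setdefault(host, set()).add(domain)

    def host_in_use_strong(self, host):
        """Strongly consistent check: reads the authoritative data."""
        return any(host in hosts for hosts in self.domain_hosts.values())

    def host_in_use_eventual(self, host):
        """Eventually consistent check: may miss recent links."""
        return bool(self.stale_index.get(host))


registry = Registry()
registry.link("example.app", "ns1.example.net")

# Before the index catches up, the two checks disagree.
print(registry.host_in_use_strong("ns1.example.net"))
print(registry.host_in_use_eventual("ns1.example.net"))
```

If a delete were gated on the eventual check in that window, the host would be removed while `example.app` still points at it, which is exactly the dangling reference the rule is meant to prevent.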

zzzcpan · on Sept 21, 2018

Well, it's pretty hard to even find applications that require strong global consistency and are willing to sacrifice latency for that. Typically apps don't need much consistency at all and can sacrifice some data instead, like with most RDBMS setups in the wild. Beyond that SEC (strong eventual consistency) covers pretty much all consistency needs there are.

erikb · on Sept 21, 2018

> it's pretty hard to even find applications that require strong global consistency

Exactly my point, even a little generalized. The world isn't consistent, so why always put unrealistic constraints on ourselves? If we change from "I really need to have all the datasets in their truest form" and instead go with "well, there is some data coming, better than nothing", the whole system might be even more reliable. Things in the middle won't die just because the world is imperfect.