
This is a really big question that quickly gets into distributed systems theory in general, but part of the idea is recognizing that a lot of the time the "bad stuff" is not so bad. For instance, if someone updates their profile picture on Facebook, and I make a request that should include it before the update has propagated to the node I'm reading from, I just get the old picture, and that's a-ok. There are definitely nasty cases (like simultaneous updates) that you have to be aware of, but for a lot of applications, slightly out-of-date data makes little difference.



I guess the part I have trouble with is how do you separate what can't be delayed from what can?

Is it just a matter of keeping the information on 2 separate systems, or are there tools that let you for instance mark one query as "take your time and make sure everything is 100% up to date before committing this" and another as "return whatever you got, it's not important"?


Cassandra, MongoDB and Redis can do that to some degree.

http://cassandra.apache.org/doc/latest/architecture/dynamo.h...

http://hector-client.github.io/hector/build/html/content/con...

Tunable durability, and some minimal atomicity and isolation guarantees, for Mongo: https://docs.mongodb.com/manual/reference/write-concern/ and https://docs.mongodb.com/manual/core/read-isolation-consiste...

http://redis.io/commands/WAIT

I guess something like this can and will eventually find its way into any practical and effective distributed data store.
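To make the "mark this query as careful vs. fast" idea concrete, here's a toy sketch in plain Python. All the names are invented for illustration (this is not any real driver's API); it just models the per-operation consistency knob that Cassandra's ONE/QUORUM/ALL levels and Mongo's write concerns expose:

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

class TinyCluster:
    """Toy replicated store with a per-operation consistency level,
    loosely modeled on Cassandra's tunable consistency. Writes at a
    given level are applied synchronously to that many replicas; the
    rest are left behind to simulate replication lag."""

    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]
        self.version = 0

    def _need(self, consistency):
        n = len(self.replicas)
        return {"one": 1, "quorum": n // 2 + 1, "all": n}[consistency]

    def write(self, key, value, consistency="one"):
        self.version += 1
        # Only the first `need` replicas see the write immediately.
        for r in self.replicas[: self._need(consistency)]:
            r.data[key] = (self.version, value)

    def read(self, key, consistency="one", replica=0):
        # Ask `need` replicas starting at `replica`; newest version wins.
        answers = []
        for i in range(self._need(consistency)):
            r = self.replicas[(replica + i) % len(self.replicas)]
            answers.append(r.data.get(key, (0, None)))
        return max(answers, key=lambda t: t[0])[1]
```

With this, a fresh write at "quorum" can still read stale at "one" from a lagging replica, but a "quorum" read is guaranteed to see it, because any read quorum overlaps any write quorum.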


One solution is to pin the user to the primary via a cookie, with a TTL long enough to cover replication lag. Facebook does something similar here.
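A minimal sketch of that routing trick (function and cookie names are made up; the TTL value is an assumption you'd tune to your observed replication lag):

```python
import time

REPLICATION_LAG_TTL = 5.0  # seconds; assumed upper bound on replication lag

def record_write(cookies, now=None):
    """Stamp the user's cookie jar (a plain dict here) after any write."""
    cookies["last_write_at"] = str(time.time() if now is None else now)

def choose_backend(cookies, now=None):
    """Route to the primary while the user's recent write may not yet
    have replicated, so they always read their own writes; otherwise
    let cheaper replicas serve the read."""
    now = time.time() if now is None else now
    last_write = float(cookies.get("last_write_at", 0))
    return "primary" if now - last_write < REPLICATION_LAG_TTL else "replica"
```

The same idea works with a signed cookie or a session field in a real app; the only requirement is that the "I just wrote" marker travels with the user's subsequent requests.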


Depends on the database, but Cassandra, for example, has a quorum mode for writes, which requires that a majority of the replicas acknowledge the write before it succeeds. This can be enabled on a per-query basis, and for reads as well.

The other way of doing it is things like CRDTs (https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...) which have a join operation for any two data values.

You have to keep it in the back of your mind that it's a thing, but working without consistency can be done.


Quorum and CRDTs deal with completely different problems.

CRDTs do one thing: they mitigate the problem of "lost updates". All acknowledged writes are represented in the results of a query, and no "winner" strategy is involved that could cause some acknowledged writes to be incorrectly dominated by others and silently lost.
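The classic small example of this is a grow-only counter. Here's a sketch (plain Python, standard G-Counter construction) showing two replicas that accept concurrent increments while partitioned and then converge on merge, whereas a last-writer-wins rule would have kept only one of the two writes:

```python
class GCounter:
    """Grow-only counter CRDT: each replica increments only its own
    slot, and merge (the join) takes the per-replica maximum, so no
    acknowledged increment can be dominated away by a 'winner' rule."""

    def __init__(self, n):
        self.slots = [0] * n  # one slot per replica

    def increment(self, replica_id, amount=1):
        self.slots[replica_id] += amount

    def merge(self, other):
        # Commutative, associative, idempotent: merging in any order,
        # any number of times, converges to the same state.
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

    def value(self):
        return sum(self.slots)

# Two replicas accept concurrent increments while partitioned...
a, b = GCounter(2), GCounter(2)
a.increment(0, 3)  # +3 acknowledged at replica 0
b.increment(1, 4)  # +4 acknowledged at replica 1

# ...then heal: both sides converge, and both increments survive.
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 7
```

A last-writer-wins register in the same scenario would resolve the conflict by picking one whole value, silently dropping the other acknowledged write.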

Quorum (strict) just provides a very, very, very weak form of consistency in the case of concurrent readers/writers (read your writes, RYW) and a merely very, very weak form in the case of serialized readers/writers (repeatable reads, RR).
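The whole guarantee strict quorum rests on is a counting argument: with N replicas, write quorum W and read quorum R, if R + W > N then every read quorum intersects every write quorum, so some replica in the read set holds the latest acknowledged write. A quick sketch that checks this exhaustively for the usual N=3, W=2, R=2 case:

```python
from itertools import combinations

N, W, R = 3, 2, 2  # strict quorum: R + W > N

# Every possible write quorum must intersect every possible read quorum.
for write_set in combinations(range(N), W):
    for read_set in combinations(range(N), R):
        assert set(write_set) & set(read_set), "quorums must overlap"
```

Note how little this proves: it only says the reader *contacts* a replica with the latest write, not that concurrent writers agree on an order, which is why it's such a weak form of consistency.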

My personal opinion is that any eventually consistent distributed database that doesn't have built-in CRDTs, or the facilities to build application-level CRDTs, is fundamentally dangerous and broken. Unless everything you ever write to it is immutable or idempotent, it will silently drop data during normal operation.


... and the problem is that the new kids think all of this is acceptable in domains where it really isn't.



