TLDR: FaunaDB is a distributed database that offers multi-partition ACID transac...

jatsign · on March 5, 2019

Sorry, I'm not trying to be snarky, but that sounds like "it's good for everything" and doesn't help me determine when I should look at Fauna and when I shouldn't.

Perhaps alternately - when is Fauna NOT a good choice?

aphyr · on March 5, 2019

A few other points to consider!

FaunaDB is a unique database: its architecture offers linear scalability for transaction throughput (limited, of course, by contention on common records and predicates). That sets it aside from databases which use a single coordinator for all, or just cross-shard, transactions, like Datomic and VoltDB, respectively.

It's also intended for geographic replication, which is a scenario many databases don't try to handle with the same degree of transactional safety--lots of folks support, say, snapshot isolation or linearizability inside one DC, but between DCs all bets are off.

FaunaDB also does not depend on clocks for safety, which sets it aside from other georeplicated databases that assume clocks are well-synchronized. Theoretically, relying on clocks can make you faster, so those kinds of systems might outperform Fauna--at the potential cost of safety issues when clocks misbehave. In practice, there are lots of factors that affect performance, and it's easy to introduce extra round trips into a protocol which could theoretically be faster. There's a lot of room for optimization! I can't speak very well to performance numbers, because Jepsen isn't designed as a performance benchmark, and its workloads are intentionally pathological, with lots of contention.

One of the things you lose with FaunaDB's architecture is interactive transactions--as in VoltDB, you submit transactions all at once. That means FaunaDB can make some optimizations that interactive session transactions can't do! But it also means you have to fit your transactional logic into FaunaDB's query language. The query language is (IMO) really expressive, but if you needed to do, e.g. a complex string-munging operation inside a transaction to decide what to do, it might not be expressible in a single FaunaDB transaction; you might have to say, read data, make the decision in your application, then execute a second CaS transaction to update state.

FaunaDB's not totally available--it uses unanimous agreement over majority quorums, which means some kinds of crashes or partitions can pause progress. If you're looking for a super low latency CRDT-style database where every node can commit even if totally isolated, it's not the right fit. It is a good fit if you need snapshot isolation to strict serializability, which are strong consistency models!

justinsaccount · on March 5, 2019

> If you're looking for a super low latency CRDT-style database where every node can commit even if totally isolated, it's not the right fit.

What is actually a good for this lately?

aphyr · on March 5, 2019

Gosh, I wish I had a good answer. Riak (and Riak-DT) were the best I knew of in this regard, but Basho folded and I haven't kept track of the level of community/corporate support available. Not sure where Riak stands these days. There are a bunch of prototype databases experimenting with CRDTs and other eventually-consistent mechanisms, but I'm not sure which ones have garnered significant traction in the industry yet.

_russelldb · on March 6, 2019

It's extremely rewarding to read what you've written there, thank you!

It's true. The community is keeping Riak alive, and pushing it forward. The CRDT stuff has not been worked on for a long time, despite there being numerous tabled improvements in that area SINCE 2014. It was neglected by the "new" management that crashed Basho into the ground as being "complicated computer science nonsense". Since then I've been unable to do the CRDT work for Riak I wanted, due to lack of time, and no one willing to pay for it (despite numerous bait-and-switch offers for completing the bigsets work.)

The CRDT work in riak needs: 1. the smaller/more efficient map merging 2. some new types from Sam merging (range reg etc) 3. the bigsets work completing and integrating (this enables delta-CRDTs) 4. big maps (built on the above work)

When it has all that it would be at the place I'd envisioned for it, before everything went super bad at Basho. I work at Joyent now, so can't dedicate any time to taking the CRDT work forward, though I really wish I could. I still have a deep interest in the EC datatypes world.

IMO, at the time it was released (2014) Riak had ground breaking support for convergent datatypes, since then it has only lost ground. Am I bitter? Yes, a little.

WRT other systems with CRDT support, https://www.antidotedb.eu/ came out of the syncfree research project. Chistopher Meiklejohn's work is industrial grade, state of the art, actively developed, and highly recommended. https://github.com/lasp-lang As others have mentioned, REDIS also uses CRDTs for MDC replication, and has CRDT co-inventor Carlos Baquero on the tech board.

NickM · on March 6, 2019

Riak still seems to have a decently active community; all of the formerly commercial-only features have been open sourced, and there have been multiple community releases since the demise of Basho, with another one coming out soon. I've seen a fair amount of activity on the riak-users mailing list lately too, and there is also paid support available via a couple different companies last I checked.

ddorian43 · on March 6, 2019

Redis labs has crdt support on top of redis

dsergeyev · on March 5, 2019

Hey I am curious what does it use in place clocks for safety and quorum? Is there a paper

aphyr · on March 5, 2019

Yes, there IS a paper! Check out Calvin: Fast Distributed Transactions for Partitioned Database Systems. FaunaDB uses Calvin with some changes to allow more types of transactions, and to improve fault tolerance. :)

http://cs.yale.edu/homes/thomson/publications/calvin-sigmod1...

evanweaver · on March 5, 2019

See our blog post as well: https://fauna.com/blog/consistency-without-clocks-faunadb-tr...

evanweaver · on March 5, 2019

Well, it is a general-purpose database with a document-relational programming model. It's a good choice when correctness, high availability, and flexible data modeling matter, like the core business objects of a service or app.

Currently it supports its own native DSL similar to LINQ or an ORM, and if you want compatibility with existing query languages from other databases, we will be rolling those out over time.

It's not a good choice for analytics or time series data which is redundant or aggregated, and doesn't need the high availability or performance overhead of transactional isolation.

0xbadcafebee · on March 5, 2019

I'm not trying to be snarky either, but your question was a bit like "What's the use case for pants?" The answer is, a lot of things. We would need to know what you're doing to tell you what it's good for, what kind to choose, and why.

zzzcpan · on March 5, 2019

In general consensus based systems are not good when latency and performance actually matter and in real world WAN setups, where connectivity issues are pretty much constant. These are fundamental limitations they can't fix.

freels · on March 5, 2019

Unless you have specifics, I’m not sure this comment stands up to reality. Modern networks are pretty reliable in my experience. The state of the art in consensus reduces learning to one round trip. Calvin further eliminates all but one global RT in distributed transaction commits.

Additionally, FaunaDB gives you the tools to work within a global environment with _safe tradeoffs_. For example, reads at SI/Serializable can be fast and uncoordinated. You choose per-query.

zzzcpan · on March 5, 2019

Well, networks are not that reliable. But you don't have to believe me, there is enough of public information about real world operations. Take Aurora paper for example, where they can't even do 2/3 quorum, but do 4/6 quorum instead and that's between datacenters they completely control with connectivity they more or less control.

aphyr · on March 5, 2019

You're not wrong: networks in general aren't super reliable, and partitions are a real problem! Peter Bailis and I wrote a paper on this in ACM Queue. I've spent much of my professional life exploring the consequences of network failure on distributed systems.

That said, I don't think it's reasonable to infer that because failures occur more often than we'd like, systems based on consensus are always inappropriate for latency and performance sensitive systems. While there are minimum bounds on latency in consensus systems, there are also plenty of human-scale problems for which that bound isn't a dealbreaker. Moreover, some types of computation (e.g. updates in sequentially consistent or snapshot isolated systems) simply can't be performed without paying the same costs as consensus. Consensus can be an appropriate, and in some cases the only possible solution, to some problems.

zzzcpan · on March 5, 2019

I didn't infer they are always inappropriate, just that they are not good for latency and performance sensitive systems, i.e. they shouldn't be considered unless absolutely necessary.

northstar702 · on March 5, 2019

Would you give some more insight into your opinion? What latency/performance are you targeting -- have you tried using a system like this to arrive at your conclusion?

freels · on March 5, 2019

Just because failure can happen doesn't mean it's frequent. For the record AWS does this to protect against correlated failure. At their scale, they assume they have some percentage of local nodes down at any given time, so have designed the system to tolerate both network partitions and local failure at the same time.

zzzcpan · on March 5, 2019

They did it for stable performance, because failures are in fact frequent. Things are much worse once you go out into the public internet, where different hosting providers communicate over all kinds of networks with all kinds of issues.

evanweaver · on March 5, 2019

FaunaDB can tolerate the loss or partition of a minority of replicas (datacenters, AZs, or racks) in a cluster, so if you want to tolerate more concurrent failures, just add more replicas.

For example, a 7 replica FaunaDB cluster can lose 3 replicas and maintain availability for reads and writes; better than Aurora in your example.

twic · on March 5, 2019

Would it be fair to say that if i am considering Gemfire/Geode, i should also consider Fauna?

evanweaver · on March 5, 2019

Yes, for sure. They are both distributed object stores and share inspiration from the Linda/tuplespace lineage, as well as relational and document databases in FaunaDB's case.

FaunaDB supports function execution in the database, if the functions are written in FQL. It has a much more powerful index system and consistency model than Gemstone/Gemfire/Geode and is designed for synchronous global replication.

However, unlike Geode it is not an in-memory system, so it is not appropriate to use as a cache.

metatype · on March 11, 2019

As noted in the writeup, the FaunaDB transaction model uses a unique query expression language. Geode applies a more familiar begin + <operations> + commit / abort approach.

Geode skews towards eventual consistency when connecting geographically dispersed clusters. This means you still get super fast local transactions with batched updates to remote sites.