Applied Monotonicity: A Brief History of CRDTs in Riak (christophermeiklejohn.com)
118 points by cmeiklejohn on March 8, 2019 | 16 comments



Just have to plug the super-practical and awesome implementation of CRDTs in Elixir's Phoenix Framework, used for PubSub and the Presence feature: https://github.com/phoenixframework/phoenix_pubsub/blob/mast...


CRDTs are great, and I've seen a lot of quality in Christopher's work.

A few other resources for those interested in CRDTs:

- Our cartoon explainer https://gun.eco/distributed/matters.html

- An excellent talk by Martin Kleppmann https://youtu.be/yCcWpzY8dIA

- Marc Shapiro's in-depth talk https://www.youtube.com/watch?v=oyUHd894w18


Nice to see more about CRDTs and Riak. Worked a bit with Riak a few years back and was sure at the time that CRDTs would play a much larger role in distributed databases, but have seen very little about them since.

Anyone have any more resources on the adoption/use of CRDTs?


CRDTs are quite actively discussed in the IPFS community, in particular here: https://github.com/ipfs/research-CRDT/ (lots of great resources if you're interested in CRDTs)

Also check out automerge, a CRDT that aims to be as JSON-like as possible: https://github.com/automerge/automerge

Martin Kleppmann (one of the creators of Automerge) has given several great in-depth talks about it, e.g. https://www.youtube.com/watch?v=yCcWpzY8dIA


I was not able to find the actual JavaScript implementation of the data type in that repo. I have tried to implement it from first principles (i.e. the paper) in my language, but large parts seem to be left under- or unspecified. Did I miss something? I would love to be wrong about this...


I haven't dug deep into the implementation, but most of the interesting CRDT stuff seems to be in the "backend" directory in that repo: https://github.com/automerge/automerge/tree/master/backend

In particular the "operations" defined in [1] section 4.2 are defined in this file: https://github.com/automerge/automerge/blob/master/backend/o...

[1] https://www.cl.cam.ac.uk/~arb33/papers/KleppmannBeresford-CR...


CRDTs are eventually consistent and most databases have been trending towards strong consistency.

Redis Enterprise is probably the biggest user of CRDTs among databases; it uses them to sync geo-replicated active-active clusters: https://redislabs.com/landing/active-active/

There's also Datanet/Kuhiro, which was a novel fully distributed database using CRDTs to always merge state across disruptions into a global view, although it now seems to be defunct: http://highscalability.com/blog/2016/10/17/datanet-a-new-crd...


Luckily for CRDTs, consensus-based strong consistency doesn't work well for most applications in geo-replicated scenarios, so CRDTs are kind of inevitable there. They are also still new and cutting edge, and still need time. Look at the history of Riak: we wouldn't even use the same designs today as they did just a few years ago.


Ultimately, though, it's pretty straightforward to write an accurate merge function for most distributed applications, with the advantage that you can reason about them as strongly consistent locally. While I could see a better ORM-style auto-merge system existing, I don't see the hassle of CRDTs being worthwhile when the vast majority of geo-replicated systems have extremely small numbers of record-level write conflicts across geos.

I'd venture that the number of potential write conflicts introduced by FB users hopping geos is small enough to simply report an error to the client and force a retry, or to handle with a handshake before the customer crosses geos.

Is there any practical database application where a CRDT is useful?


The practical application would be the example I posted about Redis replication.

Any application complex enough to need geo-replication is also likely not just writing a single key per user. There will be shared keys and data structures like tables/lists/hashes that all need to be synced.
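
To make the shared-key case concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs. The class name, method names, and replica IDs are made up for illustration; this is not how Redis Enterprise or Riak actually implement their counters. Each replica only ever increments its own slot, and merging takes the per-replica maximum, so replicas converge regardless of the order in which they exchange state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica (illustrative)."""
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        # A replica may only bump its own slot, and only upwards.
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self) -> int:
        # The observed total is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise max is commutative, associative, and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )


# Two hypothetical geo-replicas accept writes independently...
us_east = GCounter()
eu_west = GCounter()
us_east.increment("us-east", 3)
eu_west.increment("eu-west", 2)

# ...and reach the same state whichever direction the sync runs.
merged_a = us_east.merge(eu_west)
merged_b = eu_west.merge(us_east)
```

No retry or handshake is needed here because the merge never has to pick a winner; richer types (sets, maps, lists) need more bookkeeping but rely on the same property.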


Correct me if I am wrong, but don't FaunaDB and CockroachDB implement strong consistency? I don't think they use CRDTs, but maybe I'm wrong. I'm just a curious guy, by no means an expert.


The Wikipedia page, which some of us help to maintain, has a list of industry adopters.

https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...


Phoenix (the Elixir web framework) uses simple CRDTs in its Presence API.


This is mentioned in this article, as the design used in Phoenix came from Riak.


Research : Implementation : Writing style :

I think it's really worth mentioning how much work it takes to write this kind of article. Congratulations Chris, I think that on top of being a great researcher, you are also a great writer. And a quick look at the Lasp repositories shows that you did not only write about CRDTs' history, you made a big part of it happen as well.

Meiklesoft™ ftw


> Research : Implementation : Writing style :

There should have been stars there, what I meant was:

Research: * * * * * Implementation: * * * * * Writing style: * * * * *



