I agree broadly with the article’s position but I think locks are more harmful than helpful. When I was a Quip user (2018) it was super frustrating to get locked out of a paragraph because someone’s cursor idled there. Instead just allow LWW overwrites. If users have contention and your sync & presence is fast, they’ll figure it out pretty quick, and at most lose 1-2 keystrokes, or one drag gesture, or one color pick.
Notion is “collaborative” and we don’t use a CRDT for text, it’s all last-write-wins decided by the server. However our LWW texts are individually small - one block/paragraph in size - and adding/moving/removing blocks is intention-preserving if not perfectly convergent.
As the article says, the downside for LWW is that “offline” / async collaboration isn’t so great. That’s why we’re working on switching to CRDT for our texts. If you’re interested in bringing CRDTs to a product with a lot of users, consider joining Notion’s Docs team - https://boards.greenhouse.io/notion/jobs/5602426003 / @jitl on Twitter / jake@makenotion.com
So Last-Write-Wins (LWW) basically _is_ a CRDT, but not in the sense that anyone really expects, because they aren't that useful or intention preserving. Especially if the two writes happen in very quick succession / concurrently.
LWW becomes useful if you can:
a) help humans to see who is doing what on a doc
b) reduce the size of the change that is LWW
As you've said:
> However our LWW texts are individually small - one block/paragraph in size
This is really important, because by reducing the size of the individual element that any single user is updating you can also reduce the chance of conflicts. With a small amount of contextual feedback (like the notion icon next to the block) a lot of the problem cases are just avoided.
Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
(If you've worked on the notion things you're describing then I'm sure you know this better than I do, but just spelling it out really clearly.)
You can have a LWW CRDT, but not every LWW is a CRDT. LWW CRDTs generally pick a winner based on causal order which is convergent, the C in CRDT, because every peer receiving the same ops in any order would pick the same winner.
Picking a winner based on wall clock time order (as suggested in the article, and implemented by Notion) is not convergent; if peers used that algorithm to apply ops I they would not converge to the same state. Instead you need an authority (your server) to essentially pick an arbitrary order of ops and clients just have to deal with it.
A practical example is to consider three users (A, B, C) working on a document.
1. A, B, C start online working together.
2. C goes offline
3. A, B stay online, and make 100s of changes.
4. C makes a few changes while offline (not sent to other peers yet)
5. C comes back online and sends their changes to (the server / other peers).
In wall-clock LWW, C's changes will overwrite A & B's changes, even though A & B have done a lot more work.
In a causal ordering CRDT implementing LWW, C's changes will "lose" to A & B's changes, since they were actually made "earlier" in causal ordering.
> Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
I'm sure good UX is possible with locks, but I haven't used one yet for document editing. I'm still traumatized from Quip which did per-paragraph (per block?) locking and was really annoying. Eventually they added an unlock button next to the block so anyone could "steal" the lock at will due to all the user complaints, but at that point, why put it in at all?
I understand what you are saying here in terms of the difference between using wall-clock or causal ordering to determine who 'wins' for LWW. However, both of these strategies seem convergent to me? In any case, all clients will agree on whose changes win.
1. With wall-clock decided by clients, A + B changes will win since C's wall-time is earlier (yes, C could lie, but still would converge).
2. With wall-clock decided by server C will win and everyone will agree.
3. With causal ordering, everyone will agree that A + B won.
2 is not a CRDT since it requires a central server, but I think 1 would still count? Or stated another way: I'm not sure the _convergence_ is what determines if these strategies are CRDTs or not, but rather whether or not this decision making is _distributed_ or not.
You could add to the confusion by using C=Collaborative. (I personally prefer "collaborative data structures" for the more general concept of data structures that can be edited on multiple devices; CRDT is a mouthful and now also a buzzword.)
I feel the same in the paragraph. But in some situation it seems rational
Figma still limits you from editing the component if someone is on it. And Figjam did that too. In my mind, this is a good practice in the realm of collaborative design. Because it will be very messy if a single component obeys the last-write-win rule
Are you considering having a CRDT for each text block individually, or moving to a CRDT for the entire data model for a document? Really curious about the design approach here, especially insofar as there's now an external API that the data models need to service!
It’s Complicated (tm), a hybrid of the two. Text content across multiple CRDT enabled blocks may be connected in a single logical CRDT, but the segment visible in any given block is stored in that block object, which maintains our permission system invariants. We’ll still support moving blocks to different pages, so one page may have a few different CRDTs visually interleaved.
All the edge cases are very interesting, like what happens if I use backspace to join a block of CRDT B into block of CRDT A.
API consumers are unaffected. Likely we’ll support API updates of text as best-effort diff-match-patch of the text on our side applied as CRDT ops.
> everyone’s gonna say “but hey, google docs uses operational transform not CRDTs”.. OK yes, but you are not google.
Well, google docs works not because they somehow figured out how to converge OT edits with as much precision as CRDTs do, but simply because they have a central server which orders edits anyway and don't need true leader-less convergence.
In fact, I agree not many things don't need a CRDT. CRDTs help with mathematical rigidity of convergence when you want true peer-2-peer collaboration which works without any central authority.
However, most apps anyway work on top of a central authority (example SaaS apps) so there is no real reason to accomodate all the compexity that comes with CRDT. They might get far with a simpler OT based model + central server based ordering.
For example even Figma doesn't call its model a 100% pure CRDT. It's a partial, simpler CRDT implemented with an assumption that there's going to be a server that understands ordering.
It's the same with Google Docs. They don't need a CRDT because it's a cloud app after all, which means OT is more convenient with major heavylifting (ordering and conflict handlings) outsourced to the server.
Yeah. Text based OT is pretty simple to implement, too. It’s about 200 lines of code, plus a simple network protocol. It’s fast and relatively straightforward to code up, and unlike text CRDTs it’s pretty fast by default.
I use it as my standard test when trying out a new programming language. It was unexpectedly ugly in go because of go’s lack of enums & unions, and that’s one of the big reasons I never got in to programming Go.
Operational transforms are one of those interesting technologies I’ve wanted to learn by implementing but haven’t made the time yet. I also didn’t realise it could be implemented in that little code.
Can you recommend any learning resources for implementing an OT? (Ideally typescript, python, or rust)
I haven’t explored this space in a while, but I have a couple of examples that could be helpful. A Clojure library of mine [0] has a decent README with some background reading on how to use operational transform.
I also reimplemented it in a surprisingly tiny amount of OCaml [1] which was a fun way to learn that language :)
> Yeah. Text based OT is pretty simple to implement, too. It’s about 200 lines of code, plus a simple network protocol.
Just for clarification, while OT for plain-text (linear model) might be simple to implement, OT for typical rich-text editors (like Google Docs) that need to rely on a tree-structured data model is a whole different story: https://ckeditor.com/blog/lessons-learned-from-creating-a-ri....
Well, interestingly josephg (parent commentor) was part of the team that made the original gdocs editor. And I guess he worked on the Google Wave OT implementation.
Thank god I didn't question what josephg wrote regarding the text-based OT :D
I'm actually part of the team that built real-time collaboration for CKEditor 5. As the article says, we use a tree-structured representation for rich-text data and decided (many years ago) to go with OT. My guess always was that GDocs also uses a tree structure as the internal data model or that at least Google Wave did. I think I based this on a comment from a Google employee who summed up their OT implementation as hard to stabilize, even over many years.
I know it's possible to kind of represent rich-text data in form of a linear structure. This makes the implementation of the OT much much simpler. But then the editor itself becomes much more limited or you need to combine multiple instances of its model to represent more complex data. E.g., AFAIK Quill (mentioned in the other comment) does not offer stable tables implementation. Something that wasn't that big of a deal for us.
Google Docs actually uses a flat array representation of the whole document. Tables/lists and other structures are encoded using control-characters in the plain string. This makes index calculations for OT a bit easier.
How do I know? I work on Zoho Writer - a google docs alternative and we use the same technique :)
Glad to meet you. Huge fan of CKeditor's modular architecture of the editor. Making something with that level of customization and supporting realtime collaboration is not an easy feat at all!
What are your thoughts on CRDT generally? Will CKeditor ever look into adopting CRDTs in the future?
nah, that’s not true at all. have a look at ‘rich-text’[1] which allows for transforms on metadata in a separate stream from the main content. it’s the same basic algo used for OT on plain text.
(i was the cto at a startup which used this to create a multi-user text editor with rich text support in 2015ish)
Actually, back in 2015 when we started prototyping CKEditor 5, we started with this approach as well. Our goal from the beginning was to combine real-time editing capabilities with an engine capable of storing and rendering complex rich-text structures (nested tables, complex nested lists, other rich widgets, etc.). We quickly realized that a linear structure is going to be a huge bottleneck. In the end, if you want to represent trees, storing them as a linear structure is counterproductive.
So, we went for a tree model. That got many things in the engine an order of magnitude harder (OT being one). But I choose to encapsulate this complexity in the model rather than make it leak to particular plugins.
I also talked to companies that built their platforms on top of Quill. One of them ended up gluing together countless Quill instances to power their editor and overcome the limitations of the linear data model but is now looking for a way to rebuild their editor from scratch due to the issues (performance, complexity, stability).
So, yes. You can implement a rich-text editor based on a linear model. But it has its immediate limitations that you need to take into consideration.
One of the main points of this article is to "just use locks", which glosses over a lot of technical complications about locking elements within a document. How long is the lock held? Can it be stolen from a user who has gone offline, or is still online but off to lunch, and we _really_ need to make this change before the presentation in an hour? What if the user comes back online and they have changes, but the lock was stolen - how are those changes reconciled to the document?
I am generally in favor of simpler is better, and if there is a way to build a collaborative experience without using CRDTs, then go for it. However, sometimes the cure can be worse than the disease, and solutions like locking may introduce more technical complexity than originally thought.
You're absolutely right, locking is a hard problem. Especially when you get into the edge cases of clients not disconnecting cleanly.
If you've ever used one of those awful early-2000s Microsoft word collaboration systems where you have to check-out the doc, and remember to check it back in, and no one can use it until you've checked it back in... it's awful!
I'm not directly in this team, but one of the teams at my company have been working on this problem. They call it "Spaces", and one of the features solves this component locking problem.
By exploring that locking and unlocking mechanism, you will find that the logical conclusion in the end, when enough complexity and edge cases get covered/fixed as bugs, that it turns into a crude form of "CRDT" (where it's not actually consistent, but merges within reason for 99% of use cases).
For however long the user has a browser focus state on the element seems like a reasonable answer, and submit changes as they are made. However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
The server must keep track of the locks, and it can only know about the lock being released if the client tells it. E.g by sending a message that the field is not focused any more. The tricky thing is in the "edge" cases, or really the non-perfect cases, which there are plenty of (as I think GP described).
The server can decide that the client is offline if the server misses expected heartbeat messages from the client. But how often will those be sent and how long grace period will we allow? If it's too short then it will be unreliable on shaky 4G connections, if it's too long then it will be annoying in the other direction.
And that's not considering the "social" problems with locks. I've worked on replacing a system that was lock-based with CRDTs where the lunch scenario from MontagFTB actually was a common occurrence.
In an "ideal" scenario your lock acquisition problem is not hard. Client's just show the UI optimistically and whoever the server decide was first gets the lock. The loosing client throws any state the user created in the short time-frame. Over reliable and fast connections for granular locks, this works fine. But in the real world that's just one of the issues with a lock based approach…
Having used an app with locks like this (Quip) I can tell you it really sucks to have a lock lease >10s after the last edit. The “lock” should be a UI level concept anyways - clients should broadcast “hey I’m taking field X”; if you have two users submit a write+lock request concurrently you’ll need to pick a winner anyways.
> However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
It turns out you just have to pick one... This all depends on a source of truth, and when you are there it's easy to pick one, say based on whichever arrived at the network interface first.
I don't think you need a pure CRDT either but I think locking and presence is a bit of an oversimplification.
LWW is a good place to start, and updating the smallest piece of information possible is the right idea in general but there is a lot more nuance to handling complex applications like a spreadsheet (I'm working on one) and whiteboard apps.
Things like reparenting or grouping shapes [1], or updating elements that aren't at the lowest scale like deleting a row or column in a spreadsheet make locking challenging to implement. Do you lock the entire row while I'm moving it? Do you lock the entire group of shapes?
With the exception of text editing, the popular libraries like Yjs don't just give you a perfect CRDT out of the box. You still have to construct your data model in a way that enables small scale updates [2], and CRDT libraries and literature are the best source of thinking for these problems that I've found.
That's true for collaborative experience. Crdts are a mechanism to handle eventual consistency (that's even the preface of the paper). If you assume that said collaborative experience is always online, you don't need them, and "using locks" as you described is probably enough.
If you want a mechanism to handle that eventual consistency, it's probably better to reuse their principles rather than reinventing something that will eventually ressemble Crdts.
You may not need CRDT per-se, but building a collaborative experience is so difficult. I worked on collaborative systems for a bit, and also have read a bit about how Figma and Notion do it (this is a good read: https://www.figma.com/blog/how-figmas-multiplayer-technology...) -- it's still super hard to get right.
> Ever-growing state: for CRDTs to work well they need to keep a record of both what exists, and what has been deleted (so that the deletes aren’t accidentally added back in later). This means that CRDT state will continually expand.
I guess a couple things:
It depends on the CRDT. Some CRDTs grow with the number of replicas and others with the number of events.
State-based CRDTs don't need to keep history and don't need causal ordering of messages, but internal bookkeeping grows with the number of replicas. And for large states (like sets and maps), it can be prohibitive to send the state all over the wire for an idempotent merge.
That's why in practice, people implement Op-based CRDTs, which makes the trade: in order to send small ops over the wire, we now need causal ordering of messages. To make sure we can sync with replicas long offline, we keep as much history so that they can catch up.
There are other variations, such as delta-state based CRDTs that send diffs, and merkle CRDTs, which use merkle data structures to calculate diffs and detect concurrency, which have different growth characteristics.
---
As for a growing state: Is this actually a concern for devs that aren't using CRDTs for collaborative text? I can see that being an issue with the amount of changes that can happen.
But outside of that, lots of data don't grow that fast. We all regularly use Git and it keeps a history of everything. Our disks are huge, and having an immutable record is great for lots of things (providing you can access it).
> Opaque state: ...you’re generally left with an opaque blob of binary encoded data.
Most CRDT libraries take a document-orientated angle. It assumes that you can contain the entire "unit of work", like a document, inside of a CRDT. However, if your data is more relational, it doesn't quite fit. And while there's immutable data in a CRDT, I do wish it was more accessible and queryable. In addition, being a binary blob, it's not exactly composable. I think CRDT libraries should be composable with each other.
I've seen locks used at the CMSes of large news organizations. It's fine, but they all need a mechanism to kick out an editor who has an idle tab left open. For my own small scale CMS, I just wrapped Google Docs and let them handle all the syncing headaches.
We took a super simple (IMO) approach to collaborative editing in my current project:
Each block of text has a version number which must be incremented by one by the client at the time of submission. The database provides conflict prevention by uniqueness constraint which bubbles up to the API code. The frontend is informed of conflict, so that the user can be notified and let the human being perform conflict resolution.
Because most concurrent users are working on different blocks, this works great.
How do you handle getting the changes that one client makes onto the other clients? Are you pushing it from the server to the clients with websockets, or waiting for the clients to ask for new info, or waiting for the conflict to happen when someone else tries to make a change, or something else?
I'm thinking a lot about keeping server and client data in sync while working on our hopefully-soon-to-be-released LiveSync product[1]
That's not gonna work for real-world projects. Real-world apps often have larger edits than locking individual cells/cards e.g. Move columns or replace large chunks of spreadsheets in Google Sheets, or Ctrl-A to select all and then drag to move.
Also, if you consider latency, locking does not work well because client B might do operations before he/she even acknowledges the lock from client A because of latency.
> You can’t inspect your model represented by the CRDT without using the CRDT library to decode the blob, and you can’t just store the underlying model state because the CRDT needs its change history also. You’re left with an opaque blob of data in your database. You can’t join on it, you can’t search it, you can’t do much without building extra features around that state blob.
So use the CRDT library when building your indices? Or better yet use a CRDT-aware datastore. This doesn't seem like a real problem.
> Locking for safety
Please don't. You're inevitably going to have lost locks, lost updates, or most likely both.
> So use the CRDT library when building your indices?
Yeah, sure, you can build a secondary index over the data. But you're still having to decode the blob and index it. There's no version where you can index the data without using the library to expose the underlying model state (like you could if you weren't using a CRDT).
On locking, yes, it's hard. But it's not the same kind of locking that you'd expect in other parts of a system. You're locking the UI, not the actual data, so it's a tiny bit more forgiving. In general the locks aren't trying to force consistency, instead they are trying to prompt the humans to reduce the chance of conflict happening in the first place. Ofc, you still have to care about the locking, unlocking, disconnection problems, etc.
Here's a decent example/demo you can play around with in multiple windows:
> Yeah, sure, you can build a secondary index over the data. But you're still having to decode the blob and index it. There's no version where you can index the data without using the library to expose the underlying model state (like you could if you weren't using a CRDT).
How's that different from any other datastructure? You're never indexing the raw underlying bytes and you wouldn't want to. Like, sure, if you're using a datastore where indexing is built in then you need to make sure that your datastore understands the datastructures you're using, but that's always a problem that you have. Most of the CRDT infrastructure I've seen has been designed around having a datastore that understands the CRDT and building on top of that.
> On locking, yes, it's hard. But it's not the same kind of locking that you'd expect in other parts of a system. You're locking the UI, not the actual data, so it's a tiny bit more forgiving. In general the locks aren't trying to force consistency, instead they are trying to prompt the humans to reduce the chance of conflict happening in the first place.
I don't think that actually improves matters, it just means more edge cases? Either you have a locking system that's 100% consistent and people can lock each other out and deadlock, or you don't and you end up with lost updates that will be even more infuriating because the user checked whether anyone else was editing and it looked like they weren't.
For offline first apps, or for applications where very high degree of control for the content is needed (e.g. legal docs) and realtime collaboration isn't that valuable, there is also the option to use 3-way merge instead.
The benefit is that you can even allow the user to resolve conflicts in a satisfactory way.
Another benefit is that the document doesn't even have to be derived from the original, it could go through exports and re-imports and it will still be possible to run a 3-way merge as long as a common base version is declared. This can be especially covnenient for systems that involve e.g. MS Word.
> Opaque State: [...] You can’t inspect your model represented by the CRDT without using the CRDT library to decode the blob, and you can’t just store the underlying model state because the CRDT needs its change history also. You’re left with an opaque blob of data in your database.
As someone who works on a CRDT library with opaque state [1], I agree that this is a big barrier to adoption. Features like partial loading, per-paragraph permissions, and accept/reject suggestions seem pretty easy to implement if each text char is just a row in your server's DB, but I would have trouble implementing them on top of e.g. Yjs.
For text editing, one idea is to separate the CRDT "positions" from the text itself, which you can then store as a map (position -> char) in your own data structures. I've made a simple (but inefficient) library along these lines [2] and would be interested in ideas for further development.
I think the most important part of designing collaborative software, which this touches on a bit, is having a the right granularity and scope of a given change.
Last-writer-wins is only bad when the granularity of what you're editing is too big. E.g. if you're an editor like Figma and each element is a row in a database, a single row is too big. Instead you want attribute level granularity so two users can change the independent properties (like one color and the other size) without bulldozing each other.
The other key thing (that's also a common mistake) is to only consider realtime collaboration. In practice, there's always some delay (maybe just milliseconds but could be be hours or days) in how events propagate so solutions like locking don't work.
The reality is that any client-server system that needs to be highly interactive and robust to unreliable network conditions is undeniably a distributed system and therefore warrants using distributed system solutions like vector clocks, Lamport timestamps, CRDTs, etc.
Last thing is that I think many people only think of operation-based CRDTs when they think about CRDTs. You can (and we have at my company) created a fairly traditional feeling database that relies on a state-based CRDT solution that doesn't need to maintain a log of every operation that has every happened.
So yes, you might not need to reach for a fancy library like Yjs or Automerge, but it's worth understanding how these things thinks basically work because many of them are extremely simple and easy to grok - the complicated parts of Yjs and Automerge are the sophisticated data-structures and algorithms that are pretty much only needed for large document text editing.
The no-code serverless platform I built achieves this behind the scenes via a real-time CRUD API: https://saasufy.com/
The key is to perform updates on fields individually. Normally this would not be viable using HTTP due to headers/overheads (too many fields per resource to dedicate an entire HTTP request per-field) but it is viable over WebSockets as each frame is very lightweight and can even be batched. Also, being able to tie together the life of the connection to the subscription is handy to ensure that no real-time updates can be missed.
I built a chat app with authentication + access control with it (you can log in with GitHub at the bottom):
We are addressing the CRDT downsides mentioned in the article at Loro:
- Ever-growing state. This is no longer an issue. With OT-like CRDTs, you can discard unnecessary historical data at any time. This is theoretically feasible, and we are moving towards this goal.
- Complex implementation. The complexity is internal within the package, and it's written in Rust, making it universally applicable.
- Opaque state. We aim to expose these internal states through improved DevTools, making them easier to control and observe. This is one of the essential steps in enhancing our DX.
It's true that a CRDT is often not the right thing for a classic client/server application. But this doesn't mean we should just give up on ux and use locking.
There are approaches to multiplayer that are client/server native. By leveraging the authoritative server they can offer features that CRDTs can't, while preserving the great ux.
> But this doesn't mean we should just give up on ux
It's a little unfair to describe locking as "[giving] up on UX", especially given some well known collaboration products use it quite successfully; Google Sheets cell locking, Miro element/Text locking, etc.
Ofc, it's going to depend on the scope of what's being locked. These two examples are quite finely scoped element locks.
> I’ll run through a bunch of broad categories of applications, and describe how to make use of these features.
i love these kinds of taxonomies of apps, because then you can get specific about tech stack choices. just offering a couple more that i've come across in my years:
At this point, given the maturity of libraries (I was exploring this recently), I think you'd have to make the case that CRDTs are bad not just "too much".
Interfacing with the 'blob' is a real thing (y-js is solving some of this with a rust implementation that has cross language binding) but generally the things they noted (e.g. a Figma canvas) aren't things you commonly do joins across and if you did you'd have an independent indexing store for that functionality.
With tools like SyncedStore [1] and HocusPocus [2] you end up with a pretty good, we'll tested, easy to implement base for good collaboration.
The Wikipedia page for Operational Transformation [1] mentions Differential Synchronization [2] as an alternative, does anyone have any experience with DS?
When you search for usage of Differential Sync you'll find lots of news reports about it being included as technology in file sync and file backup solutions.
I've not directly worked on it, but I get the impression that it's mostly used in file backup because of how Differential Sync works. In Differential Sync the server and the client both hold two copies of the data/document. One representing their own state, and one representing what they thing the other's state is. Then the differences are computed between those documents and sent to the other.
For any given client, this isn't too onerous. You'd have to hold the client and server copy of the doc. But for a server, serving many clients, you'll end up with a lot of client-specific copies of that document. For N clients, you'd have N+1 copies of the document. It doesn't take too many clients before this becomes a real problem.
So if you're backing up 1 client to 1 server, Differential Sync is fine. But if you're trying to keep tens, hundreds, or thousands of clients in sync then you end up with a lot of client "shadow" copies of the server's document.
Developers has solved this is most VCS by merging changes. The trouble with merging is that it's hard to explain and hard to visualize for non technical users. Once that problem is solved, there's a lot of nice tools to use for collaborative experiences.
> Ever-growing state: for CRDTs to work well they need to keep a record of both what exists, and what has been deleted (so that the deletes aren’t accidentally added back in later). This means that CRDT state will continually expand. There’s a bunch of magic that CRDT library authors are doing with clever compression techniques to make this problem less-bad, but it’s basically in-escapable. The size of your CRDT state is not purely a function of the size of the state the CRDT represents, but also of the number of updates that state has gone through.
A) This is only the case for certain CRDTs, such as sets that support deletion - so, if you want Set semantics with deletion support, you need two sets, one to that tracks all deletions and one that tracks all insertions.
B) You can garbage collection your sets. They don't have to grow forever.
> Complex implementations: CRDTs are easy to implement wrong, so probably don’t roll your own.
Personally, I've never done this. I've just added a `merge(&mut self, other: &Self)` method to structs in Rust. Guaranteeing CRDT properties is often trivial, or at least it was in my case.
> Opaque state: Because the CRDT has to represent both the underlying state and the updates that led to that state
Again, this is only if you need specific operations on your CRDTs and if your CRDTs are encoded in specific ways.
I've said it before, but a trivial crdt looks like this
et voila? Obviously you lose all intermediary states, but since that is specified to be a negative thing, I just want to be clear that it's often optional.
> So maybe you are convinced that CRDTs are not the be-all-and-end-all of collaboration, and that you aren’t in one of the two categories where you probably should use a CRDT, and you’ve made it this far in the post.
I am convinced that CRDTs are not the be-all-and-end-all, because Strong Eventual Consistency does not provide strong enough guarantees for all use cases.
Once again we have a CRDT article that's about user collaboration, which I find somewhat frustrating because CRDTs can be used in far more places than that, and user collaboration is like the most complicated thing you could ever write since it's all of the problems of a distributed system and then we add humans into the mix. There is no "good" solution to this problem - CRDTs aren't going to solve it, and neither is any other algorithm, because it's not possible to encode every possible state update in a way that never conflicts and is also what a human expects (especially since humans have varying expectations).
The algorithm/ approach, as described, seems perfectly fine - it will have edge cases just like CRDTs will. In reality, for such an impossibly complex problem, you're probably going to end up with something really complex to solve it. You're almost certainly going to start adding CRDT-like operations, like "ok technically this user held a lock on X, but the other user performed an operation on X that technically commutes, so we can allow both" to alleviate some of the inherent complexities (and UX issues) with locking.
I think we can. Following all the implications down the rabbit hole, you end up with a system architecture where there's no front-end, and there's no back-end.
Since CRDTs are still kinda new for most people, the discussion hasn't really gotten there yet.
And wrangling CRDTs into a client-server architecture is actually extra work. The basic design assumes you can trust every peer -- making sure that jes5199 actually was the peer who added a comment that is marked as author: "jes5199" is not trivial.
Conflict-Free Replicated Data Type (CRDT) is a type of data structure that enables concurrent updates across multiple replicas without the need for coordination between them.
Notion is “collaborative” and we don’t use a CRDT for text, it’s all last-write-wins decided by the server. However our LWW texts are individually small - one block/paragraph in size - and adding/moving/removing blocks is intention-preserving if not perfectly convergent.
As the article says, the downside for LWW is that “offline” / async collaboration isn’t so great. That’s why we’re working on switching to CRDT for our texts. If you’re interested in bringing CRDTs to a product with a lot of users, consider joining Notion’s Docs team - https://boards.greenhouse.io/notion/jobs/5602426003 / @jitl on Twitter / jake@makenotion.com