If the server can't operate on the content, it can't merge it into the CRDT documents. That means it would need to send and receive the entire state of the CRDT with each change.
If the friend is online then sending operations is possible, because they can be decrypted and merged.
Generally, this is not really true. The point of CRDTs is that as long as all parties receive all messages (in any order), they should be able to recreate the same state.
So instead of merging changes on the server, all you need is some way of knowing which messages you haven't received yet. Importantly, this does not require the server to be able to actually read those messages. All it needs is some metadata (basically just an ID per message). When a client reconnects, the server sends it all the not-yet-received messages, so it's probably useful to keep track of which client has received which messages rather than figuring that out every time a client connects.
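Something like this sketch (names made up, not any particular library's API), where the server treats every message as an opaque blob and only tracks delivery state:

```typescript
// A "dumb" encrypted relay: it stores ciphertexts it cannot read, keyed by a
// monotonically increasing message ID, and remembers per client how far that
// client has caught up.
class Relay {
  private log: { id: number; ciphertext: Uint8Array }[] = [];
  private delivered = new Map<string, number>(); // clientId -> last ID sent

  append(ciphertext: Uint8Array): void {
    this.log.push({ id: this.log.length, ciphertext });
  }

  // On reconnect, send only the messages this client hasn't seen yet.
  catchUp(clientId: string): Uint8Array[] {
    const from = (this.delivered.get(clientId) ?? -1) + 1;
    const missing = this.log.slice(from);
    this.delivered.set(clientId, this.log.length - 1);
    return missing.map((m) => m.ciphertext);
  }
}
```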
You’re talking past each other. These are both valid descriptions of CRDTs - just different types.
Generally there are two categories of CRDTs: state-based and operation-based.
State-based CRDTs are like a variable which is set to a new value each time it changes (think CouchDB if you’ve used it). In that case, yes, you generally do update the whole value each time.
Operation-based CRDTs - used in things like text editing - are more complex, but like the parent said, they deal with editing events. So long as a peer eventually gets all the events, it can merge them together into the resulting document state. CRDTs have a correctness criterion: the same set of operations always merges into the same document, on all peers, regardless of the order you receive the messages.
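To make that correctness criterion concrete, here's a toy operation-based CRDT (an add-only set, far simpler than a text CRDT): every operation commutes, so any delivery order of the same operations produces the same state on every peer.

```typescript
type AddOp = { add: string };

function apply(state: Set<string>, op: AddOp): void {
  state.add(op.add);
}

const ops: AddOp[] = [{ add: "a" }, { add: "b" }, { add: "c" }];

// Two peers receive the same operations in opposite orders...
const peer1 = new Set<string>();
const peer2 = new Set<string>();
ops.forEach((op) => apply(peer1, op));
[...ops].reverse().forEach((op) => apply(peer2, op));

// ...and converge on the same state: { "a", "b", "c" }
```

Text CRDTs need much more machinery to make inserts and deletes commute, but the convergence property is the same.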
Anyway, I think the parent comment is right here. If you want efficient E2E encryption, using an operation-based CRDT is probably a better choice.
If it takes 1 second per merge, as per the article, that sounds like a poor user experience: when new people join, they have to wait hundreds or thousands of seconds to get the doc.
I... still can't make heads or tails out of this description. Let me restate how I understand the scheme in TFA: there are two people editing the same document using CRDTs. When one person makes an edit, they push an encrypted CRDT update to the sync server. Periodically, each of them pulls the edits made by the other from the sync server, applies them to their own copy, and pushes the (encrypted) result back. Because of the CRDT's properties, they both end up with the same document.
This scheme doesn't require the two people to be online simultaneously — all updates are mediated via the sync server, after all. So, where am I wrong?
I think the difference in understanding is that the article implies that it's the server, not the clients, applying the changes to the document when it receives a change message.
If the clients were applying the changes, then we wouldn't need homomorphic encryption in the first place. The server would just store a log of all changes, cleaning it up once it was sure everyone had replayed them, if that's possible.
Without homomorphic encryption, the server must store a full snapshot of the document plus all changes since that snapshot. Whereas with it, the server only ever stores the most recent copy of the document.
This could be done to reduce the time required for a client to catch up once it comes online (because otherwise it would need to replay all changes that have happened since it last connected to achieve the conflict-free modification). But the article also mentions something about keeping the latest version quickly accessible.
> One way to solve this is end-to-end encryption. You and your friend agree on a secret key, known only to each other. You each use that key to encrypt your changes before sending them, decrypt them upon receipt, and no one in the middle is able to listen in. Because the document is a CRDT, you can each still get the latest document without the sync server merging the updates.
That is indeed a solution, but then the article for some reason claims that this scheme requires both parties to be online simultaneously. No, it doesn't, unless this scheme is (tacitly) supposed to be directly peer-to-peer, which I find unlikely: if it were P2P, there would be no need for "the sync server" in the first place, and the description clearly states that in this scheme the server doesn't do anything with document updates except relay them.
Hi, author here! The scenario was meant to be high level — I guess I should have gotten more into the various architectures and tradeoffs, but the article is already pretty long.
The way I see it there are a couple of ways this can shake out:
1. If you have a sync server that only relays the updates between peers, then you can of course have it work asynchronously — just store the encrypted updates and send them when a peer comes back online. The problem is that there's no way for the server to compress any of the updates; if a peer is offline for an extended period of time, they might need to download a ton of data.
2. If your sync server can merge updates, it can send compressed updates to each peer when it comes online. The downside, of course, is that the server can see everything.
Ink & Switch's Keyhive (which I link to at the end) proposes a method for each peer to independently agree on how updates should be compressed [1], which attempts to solve the problems with #1.
> The problem is that there's no way for the server to compress any of the updates; if a peer is offline for an extended period of time, they might need to download a ton of data.
There are ways to solve this, using deterministic message chunking. Essentially clients compress and encrypt “chunks” of messages. You can use metadata tags to tell the server which chunks are being superseded. This is fast and efficient.
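Roughly (made-up shapes, not Automerge's actual wire format): the server stores opaque chunks plus a little plaintext metadata saying which chunks each new one replaces, so it can garbage-collect without ever decrypting anything.

```typescript
type ChunkUpload = {
  chunkId: string;        // deterministic hash of the chunk contents
  supersedes: string[];   // IDs of chunks this compressed chunk replaces
  ciphertext: Uint8Array; // encrypted, compressed batch of messages
};

class ChunkStore {
  private chunks = new Map<string, Uint8Array>();

  // The server swaps old chunks for new ones based on metadata alone.
  accept(upload: ChunkUpload): void {
    for (const old of upload.supersedes) this.chunks.delete(old);
    this.chunks.set(upload.chunkId, upload.ciphertext);
  }

  // A peer catching up downloads only the surviving chunks.
  allChunks(): Uint8Array[] {
    return [...this.chunks.values()];
  }
}
```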
Alex Good gave a talk about how he’s implementing this in automerge at the local first conference a few weeks ago:
I would really love this to be added to the article (or a follow-up), since it was my conclusion as well, and most readers are going to be thinking the same thing.
I'd also love to know how the trade-off between FHE compute time and the bloat of storing large change sets affects latency in the online and offline cases.
Perhaps, as with many things, a hybrid approach would be best suited for online responsiveness and offline network and storage use?
Admittedly, I haven't read the linked research papers at the end. Perhaps they have nice solutions. Thanks for that.
There's another option: let the clients do the compression. I.e. a client would sign & encrypt a message "I applied messages 0..1001 and got document X". Then this can be a starting point, perhaps after it's signed by multiple clients.
That introduces a communication overhead, but is still likely to be orders of magnitude cheaper than homomorphic encryption.
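As a sketch (all names hypothetical), the checkpoint could look something like this, with the server only allowed to drop the underlying messages once enough clients have co-signed:

```typescript
type Checkpoint = {
  upToMessage: number;            // highest message index folded into the snapshot
  snapshotCiphertext: Uint8Array; // encrypted document state "X"
  signatures: Map<string, Uint8Array>; // peerId -> signature over the claim
};

// E.g. require several independent clients to vouch for the checkpoint
// before the server may discard messages 0..upToMessage.
function isTrusted(cp: Checkpoint, quorum: number): boolean {
  return cp.signatures.size >= quorum;
}
```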
The protocol half of most real-world CRDT implementations does not send the raw stream of changes. Instead, changes are compressed into a minimal patch set, and each patch set is specific to an individual peer, based on the state of that peer's local CRDT at merge time.
The naive raw stream of changes is far too inefficient due to the immense amount of overhead required to indicate relationships between changes. Changing a single character in a document needs to include the peer ID (e.g., a 128-bit UUID, or a public key), a change ID (like a commit hash, also about 128 bits), and the character's position in the document (usually a reference to the parent's ID and a relative marker indicating the insert is either before or after the parent).
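As an illustrative type (not any specific library's format), a single-character insert carries something like:

```typescript
type OpId = { peer: string; counter: number }; // peer: 128-bit UUID or public key

type InsertOp = {
  id: OpId;                 // unique change ID, like a commit hash
  parent: OpId;             // the existing character this insert anchors to
  side: "before" | "after"; // position relative to the parent
  char: string;             // the single inserted character
};
```

That's tens of bytes of metadata to convey a single character, which is why real patch sets compress runs of changes together.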
The other obvious compression is deletions. They will be compressed to tombstones so that the original change messages for deleted content do not need to be relayed.
And I know it is only implied, but peer-to-peer independent edits are the point of CRDTs. The “relay server” is there only for the worst-case scenario described: when peers are not simultaneously available to perform the merge operation.
So, there is a reason that CRDT researchers would not like this response you have given. Judging from the thread below you, it's not why the author jakelazaroff didn't like it, but it's worth giving this answer too.
The reason CRDT researchers don't like the sync server is, that's the very thing that CRDTs are meant to solve. CRDTs are a building-block for theoretically-correct eventual consistency: that's the goal. Which means our one source-of-truth now exists in N replicas, those replicas are getting updated separately, and now: why choose eventual consistency rather than strong consistency? You always want strong consistency if you can get it, but eventually, the cost of syncing the replicas is too high.
So now we have a sync server like you planned? Well, if we're at the scale where CRDTs make sense then presumably we have data races. Let's assume Alice and Bob both read from the sync server and it's a (synchronous, unencrypted!) last-write-wins register, both Alice and Bob pull down "v1" and Alice writes "v1a" to the register and Bob in parallel writes "v1b" as Alice disconnects and Bob wins because he happens to have the higher user-ID. Sync server acknowledged Alice's write but it got lost until she next comes online. OK so new solution, we need a compare-and-swap register, we need Bob to try to write to the server and get rejected. Well, except in the contention regime that we're anticipating, this means that we're running your sync server as a single-point-of-failure strong consistency node, and we're accepting the occasional loss of availability (CAP theorem) when we can't reach the server.
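To make the race concrete, here's the last-write-wins register from that scenario as code, collapsing the tiebreak down to just the writer's user ID (real LWW registers usually compare timestamps first):

```typescript
type Register = { value: string; writerId: number };

// Higher writer ID wins on concurrent writes.
function merge(a: Register, b: Register): Register {
  return a.writerId >= b.writerId ? a : b;
}

const alice: Register = { value: "v1a", writerId: 1 };
const bob: Register = { value: "v1b", writerId: 2 };

// Regardless of arrival order, Bob's write shadows Alice's:
console.log(merge(alice, bob)); // { value: "v1b", writerId: 2 }
console.log(merge(bob, alice)); // { value: "v1b", writerId: 2 }
```

The merge converges, but Alice's acknowledged write is silently shadowed, which is exactly the kind of surprise that pushes you toward compare-and-swap and strong consistency.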
Even worse, such a sync server _forces_ you into strong consistency even if you're like "well the replicas can lose connection to the sync server and I'll still let them do stuff, I'll just put up a warning sign that says they're not synced yet." Why? Because they use the sync server as if it is one monolithic thing, but under contention we have to internally scale the sync server to contain multiple replicas so that we can survive crashes etc. ... if the stuff happening inside the sync server is not linearizable (aka strongly consistent) then external systems cannot pretend it is one monolithic thing!
So it's like, the sync server is basically a sort of GitHub, right? It's operating at a massive scale and so internally it presumably needs to have many Git-clones of the data so that if the primary replica goes down then we can still serve your repo to you and merge a pull request and whatever else. But then it absolutely sucks to merge a PR and find out that afterwards, it's not merged, so you go into panic mode and try to fix things, only for 5 minutes later to discover that the PR is now merged. And a really active eventually-consistent CRDT system has a lot of that kind of bug potential.
For the CRDT researcher the idea of "we'll solve this all with a sync server" is a misunderstanding that takes you out of eventual-consistency-land. The CRDT equivalent that lacks this misunderstanding is, "a quorum of nodes will always remain online (or at least will eventually sync up) to make sure that everything eventually gets shared," and your "sync server" is actually just another replica that happens to remain online, but isn't doing anything fundamentally different from any of the other peers in the swarm.
> That means it would need to send and receive the entire state of the CRDT with each change.
> If the friend is online then sending operations is possible, because they can be decrypted and merged.
Or the user's client can flatten un-acked changes and tell the server to store that instead.
It can just always flatten until it hears back from a peer.
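Something like this sketch, with concat standing in for a real CRDT "merge two updates" function:

```typescript
// Keep squashing local, un-acked changes into one combined update; once a
// peer acknowledges it, start a fresh batch.
class OutboundBuffer {
  private pending: Uint8Array | null = null;

  constructor(private concat: (a: Uint8Array, b: Uint8Array) => Uint8Array) {}

  push(change: Uint8Array): void {
    this.pending = this.pending ? this.concat(this.pending, change) : change;
  }

  toSend(): Uint8Array | null {
    return this.pending;
  }

  onAck(): void {
    this.pending = null;
  }
}
```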
The entire scenario is over-contrived. I wish they had just shown it off instead of resting it on a justification that doesn't hold.
There are variants of CRDTs where each change is only a state delta, or each change is described in terms of operations performed, which don't require sending the entire state for each change.
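For example, a delta-state grow-only counter only ever ships the entry that changed, and merging is an entrywise max, so replicas converge no matter how deltas are ordered or duplicated:

```typescript
type GCounter = Map<string, number>; // replicaId -> count

// Increment locally and return just the delta (one changed entry).
function increment(counter: GCounter, replica: string): GCounter {
  const next = (counter.get(replica) ?? 0) + 1;
  counter.set(replica, next);
  return new Map([[replica, next]]);
}

// Merge a delta (or a full state) with an entrywise max.
function merge(into: GCounter, delta: GCounter): void {
  for (const [replica, n] of delta) {
    into.set(replica, Math.max(into.get(replica) ?? 0, n));
  }
}
```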