Automerge 2.0

josephg · on Jan 31, 2023

Congratulations to the automerge team! This is a fantastic accomplishment.

I particularly enjoy the performance improvements. I benchmarked automerge 18 months ago and the benchmark took about 5 minutes to process the edits of a paper. Some single character inserts took as much as 2 seconds of cpu time. From the article it looks like this entire editing trace (of 260000 keystrokes) is down to 600ms. That’s a huge improvement. It means automerge is similar in performance to yjs, and in turn that makes automerge useful for a much broader set of applications.

One thing I really enjoy about the collaborative editing space is how much ideas are shared around. The highly compact binary encoding was done first in automerge, then copied and tuned in yjs and diamond types. The idea of using an internal list rather than a tree went the other way - yjs came up with the idea, and that approach has landed in all production sequence CRDTs that I know about.

There’s a bunch of work in the pipeline around non-interleaving, BFT properties, database interoperability and more performance tuning that we are (collectively) still figuring out. But the future of CRDTs seems bright. In a few years I’d love all new software to be built on local first fundamentals. Work like this is how we get there.

To everyone involved, great work! Keep it coming!

jamil7 · on Jan 31, 2023

Thanks for all your work on performance, my side project is a rich text CRDT in Swift which wraps AttributedString. I took a lot of inspiration from Peritext and used your blog post extensively for performance tuning.

nugmanoff · on Jan 31, 2023

hey, it sounds super cool! mind sharing the link? definitely something I'd love to use

jamil7 · on Jan 31, 2023

Haven’t open sourced it yet but will definitely post to HN at some point.

munhitsu · on Jan 31, 2023

shameless plug :) you can steal whatever you want from: https://github.com/munhitsu/CRAttributes https://github.com/munhitsu/CRAttributesDemo

satvikpendem · on Jan 30, 2023

See also, Autosurgeon (with a 0.3.0 release today), which is a higher level API on top of Automerge for Rust:

I'm building a mobile app with a server backend, and I was looking for resources to build them in an offline-first way (since unlike on the browser, people expect to use apps offline, if they can, such as fitness or habit trackers).

I found the concept of conflict-free replicated data types (CRDTs) interesting as it allows you to have fully offline experiences while also having a conflict-free syncing experience. I was looking for some good libraries and came across automerge [0] and yrs [1], but both had some rough APIs as they're primarily low-level Rust libraries that are wrapped by higher-level TypeScript APIs.

Autosurgeon wraps the low-level API of automerge to make it much more ergonomic, closer to the TypeScript experience, but in Rust of course. You can for example use `struct`s which autosurgeon will serialize and deserialize automatically, which is not present in base automerge, which focuses more on string keys and arbitrary values.

I am planning on using this together with Flutter and flutter_rust_bridge [2] in order to use this same Rust library everywhere. In this case, the server just becomes another (albeit more privileged) client.

[0] https://github.com/automerge/automerge-rs

[1] https://github.com/y-crdt/y-crdt

[2] https://github.com/fzyzcjy/flutter_rust_bridge

KRAKRISMOTT · on Jan 30, 2023

Be careful when using CRDTs. Having no conflicts does not mean the end result is correct. In many cases you essentially converge to last-write-wins with respect to the Lamport clock.
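
A minimal sketch of that failure mode, assuming a single register that carries a Lamport timestamp plus a replica id as tie-breaker. The `LWWRegister` class and its fields are made up for illustration; they are not taken from Automerge or Yjs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: str
    clock: int     # Lamport timestamp of the write
    replica: str   # tie-breaker when two writes share a timestamp

    def set(self, value, replica):
        # A local write bumps the Lamport clock past everything seen so far.
        return LWWRegister(value, self.clock + 1, replica)

    def merge(self, other):
        # Highest (clock, replica) pair wins; the loser is silently dropped.
        return max(self, other, key=lambda r: (r.clock, r.replica))


base = LWWRegister("Buy milk", 0, "alice")
alice = base.set("Buy oat milk", "alice")        # concurrent edit 1
bob = base.set("Buy milk and eggs", "bob")       # concurrent edit 2

merged_ab = alice.merge(bob)
merged_ba = bob.merge(alice)
```

Both merge orders agree, so the replicas converge, but the result is simply Bob's write: Alice's edit is gone without any error being raised.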

LAC-Tech · on Jan 31, 2023

Generally agree with this. There is no magic solution to resolving conflicts in multi master systems (despite what some database marketing may imply). CRDTs are predictable but they are 'dumb' in how they automatically merge. Make sure the outputs are likely to make sense for your problem domain.

api · on Jan 31, 2023

The fundamental thing is that no merge or consensus algorithm can somehow telepathically know the real world intent of its users.

CRDTs can best be thought of as a way to eliminate spurious and false conflicts, leaving only real errors. Without them anyone who has ever coded a data merge knows you tend to get a ton of noise.

So basically you have reduced the problem surface area.

LAC-Tech · on Jan 31, 2023

Can you expand on "spurious and false conflicts" here?

CGamesPlay · on Jan 31, 2023

Not the OP, but I'm guessing he's referring to, for example, two users each correcting a typo in a different location in the document. From the perspective of the text CRDT, there's no conflict, and users are likely to agree. Raising a "file edited simultaneously, choose which version to use" error would be a "spurious and false conflict" in this sense.

Note that from a different user perspective, say a code document, such a conflict is actually correct and desired. So it's all about context.

pvh · on Jan 31, 2023

I prefer to think of Automerge as a form of version control: because the full history is retained, if you don't like the merge you can decide what you want to do instead.

satvikpendem · on Jan 31, 2023

In automerge's (and usually any CRDT implementation's) case, if it encounters a merge conflict, it will allow you to handle it with a custom merge function. So it's not necessarily that CRDTs are truly "conflict-free," just that it will merge correctly in all other cases than editing the same value at the same time.

CGamesPlay · on Jan 31, 2023

While this is true, the base "text" CRDT generally does the right thing for user documents, and conflicts are generally handled reasonably (though it's fair to say a bad conflict would not be automatically resolved "correctly"). Yjs (not Automerge) also has an XML CRDT, which extends the text CRDT to always have correct XML syntax (although again, which text falls into the <em> and which text falls outside of it may not be "correct" in the case of a conflict).

samstave · on Jan 31, 2023

Is it possible to have a selectable roll-back/diff feature such that if the sync goes through - the originals on both sides have a 'backup'/source-of-truth option such that you can revert easily?

ChadNauseam · on Jan 31, 2023

Yes, the full history is always retained

paulgb · on Jan 30, 2023

Autosurgeon repo: https://github.com/automerge/autosurgeon

pugio · on Jan 30, 2023

Thanks for the links, this is pretty interesting stuff. Just a quick note: it's Conflict Free Replicated Data Types - not Relational.

satvikpendem · on Jan 30, 2023

Yep you're right, fixed.

crabmusket · on Jan 31, 2023

This is neither here nor there, but I've always preferred "convergent replicated data-types", as there is some confusion about what the "true acronym" was intended to be.

abiro · on Jan 31, 2023

Another thing to keep in mind is that if you want the data to be end-to-end encrypted, then you need both devices to be online at the same time to sync with Automerge.

jitl · on Jan 31, 2023

Big congratulations to the Automerge folks for shipping this after years of work. At this point both Yjs and Automerge have Rust libraries and (soon?) bindings for more languages than just JS that stay in sync.

Yjs (pure javascript?) is quoted on the paper benchmark at 1,074ms and 10,141,696 bytes of memory, compared to Automerge 2.0.2-unstable at 661ms and 22,953,984 bytes of memory. It looks like Automerge 2 latest is faster than Yjs, but still uses 2x more memory.

I wonder if this is comparing usage from JS via bindings, or directly comparing two different rust implementations, or comparing Automerge 2.0.2-unstable via Rust to Yjs via NodeJS.

I am still not sure which set of tools I would recommend; I believe Yjs is more actively deployed in production since the Automerge implementation was so far behind performance wise until now. However one of the Peritext authors (https://twitter.com/sliminality who is on my team at Notion) tells me that Automerge is better at text because it doesn't suffer from interleaved characters like Yjs does. So consider it instead of Yjs!

josephg · on Jan 31, 2023

I’ve spent a lot of time benchmarking both libraries and talking to the authors. The main difference is that yjs has an extra optimization that’s still missing from automerge: Yjs does internal run-length encoding of adjacent inserted items. And adjacent inserts come up a lot in real text editing traces.
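
Roughly what that optimization looks like, as a toy sketch rather than how Yjs or diamond types actually implement it. It assumes every insert lands at the end of the document, and the `Run` type and `append_insert` helper are hypothetical names. A real implementation also has to check that the new item sits directly after the previous one and split runs on mid-run edits.

```python
from dataclasses import dataclass


@dataclass
class Run:
    agent: str   # who typed this run
    seq: int     # sequence number of the first character in the run
    text: str    # all characters in the run, stored together


def append_insert(runs, agent, seq, ch):
    """Record a single-character insert, extending the last run when possible."""
    last = runs[-1] if runs else None
    if last is not None and last.agent == agent and last.seq + len(last.text) == seq:
        last.text += ch                  # same agent, consecutive seq: grow the run
    else:
        runs.append(Run(agent, seq, ch))  # otherwise start a new run


runs = []
for i, ch in enumerate("hello"):
    append_insert(runs, "alice", i, ch)
append_insert(runs, "bob", 0, "!")
```

Six keystrokes end up as two items instead of six, which is where the memory and speed savings come from on real typing traces.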

Adding this optimization to diamond types, in pure rust, improved performance by another order of magnitude (25ms for the same test with this tweak). It also dropped memory usage to about 2MB. The automerge engineers know about this trick (I’ve talked to them about it). So I assume it’s in the pipeline somewhere. And yjs is working on a rust reimplementation, which should bring its performance in line too.

memorythought · on Jan 31, 2023

This optimization is indeed in the pipeline, although there are other things nearer the front because performance is currently Good Enough ™ that other things are more pressing (other things being e.g. completing the Peritext implementation, improving the sync protocol).

satvikpendem · on Jan 31, 2023

diamond-types (for reference for others [0]) still only supports plain text, is that right? I was thinking of using it for more general use cases such as an offline habit tracker, which isn't text of course, but I was interested to hear more on the progress towards other data types such as generic JSON data.

Currently for this use case I've been using autosurgeon [1] so far which has a nice Rust API for structs, even if it might be slower than yjs (or yrs, its Rust implementation) or diamond-types.

[0] https://github.com/josephg/diamond-types

[1] https://github.com/automerge/autosurgeon

josephg · on Jan 31, 2023

Yep; sadly still true. I started some work last year to simultaneously add support for arbitrary JSON data and add a database-like storage layer to allow us to safely stream changes to disk. (Automerge and yjs usually require the entire data set to be re-saved when updates happen). It's taken longer than I thought, because I've gone through a bunch of different designs for both pieces. We'll get there; everything just takes longer than you want when you do it for the first time.

I'll look at autosurgeon. Having similar APIs is good for everyone.

meitros · on Jan 31, 2023

Was also curious how the comparisons were being done. Isn’t there a new non-negligible overhead of converting and passing data structures between js and rust?

josephg · on Jan 31, 2023

I've gotten similar performance from yjs (about 1 second from this test). To do it, I ran the benchmark itself in nodejs / V8. When I benchmarked in rust, the same data set was loaded from JSON and benchmarked using pure rust code.

It's not a fair test if you'll be running this code in a browser, since the rust code will need to be compiled to wasm (and suffer a ~3x performance penalty), while the javascript code will run at the same speed. But whether that matters to you depends on what you're doing.

kevincox · on Jan 31, 2023

3x seems unusually high. It definitely depends what you are doing but I recently compared a Ricochet Robots solver that I wrote (A* search) and it took only about 110% of native time. When I benchmarked it a couple of years ago it took about 2x so things have definitely improved a lot.

I'm sure the use case matters a lot. But at least in some cases the result can be very close to native.

lll-o-lll · on Jan 31, 2023

Wow, 3x perf penalty? I thought wasm was supposed to get us to “near native speeds”. Is this typical of the penalty paid, or something specific to automerge?

josephg · on Jan 31, 2023

Yep. At least, that's the slowdown rate I saw porting my own (very optimized) CRDT to wasm last year. I haven't measured automerge's wasm build but 3x slower is a reasonable baseline for wasm's performance compared to x86_64 code.

Some of that difference is a lack of auto-vectorization in wasm. Wasm SIMD is pretty new, and not well supported in wasm runtimes yet as far as I know.

dang · on Jan 31, 2023

Automerge: A JSON-like data structure (a CRDT) that can be modified concurrently - https://news.ycombinator.com/item?id=30412550 - Feb 2022 (69 comments)

Automerge: a new foundation for collaboration software [video] - https://news.ycombinator.com/item?id=29501465 - Dec 2021 (29 comments)

Automerge: A library [..] for building collaborative applications in JavaScript - https://news.ycombinator.com/item?id=24791713 - Oct 2020 (1 comment)

Automerge: JSON-like data structure for building collaborative apps - https://news.ycombinator.com/item?id=16309533 - Feb 2018 (98 comments)

satvikpendem · on Jan 31, 2023

I'd also add

- Local First Software [https://news.ycombinator.com/item?id=31594613 (28 comments)] by Martin Kleppmann (who works on Automerge at the company Ink and Switch, perhaps better known as the author of Designing Data Intensive Applications), which introduces Automerge

- CRDTs: The Hard Parts [https://news.ycombinator.com/item?id=23802208 (124 comments)], a video talk also by Kleppmann

- CRDTs go brrr, 5000x faster CRDT implementations [https://news.ycombinator.com/item?id=28017204 (151 comments)], by the creator of another CRDT in Rust library, Diamond Types [https://github.com/josephg/diamond-types]

sirodoht · on Jan 31, 2023

So exciting! Strangely enough, a couple of hours before this release, we just managed to wrap our heads around Yjs after playing with it on and off for a few weeks!

For anyone not up to date with the world of CRDTs, Seph Gentle's two blog posts have become legendary:

* https://josephg.com/blog/crdts-are-the-future/

* https://josephg.com/blog/crdts-go-brrr/

these are also worth checking out:

* https://github.com/y-crdt/y-crdt (rust implementation started by the creator of Yjs, Kevin Jahns)

* https://github.com/y-crdt/ypy (python bindings for the rust implementation)

* https://github.com/josephg/diamond-types (Seph Gentle's rust implementation of YATA, the algorithm behind Yjs)

mkl · on Jan 31, 2023

Some big past HN threads on those blog posts:

CRDTs are the future https://news.ycombinator.com/item?id=24617542 312 comments, https://news.ycombinator.com/item?id=31049883 45 comments

Faster CRDTs: An Adventure in Optimization https://news.ycombinator.com/item?id=28017204 151 comments, https://news.ycombinator.com/item?id=33903563 22 comments

the_duke · on Jan 31, 2023

As it stands CRDTs are only really useful for a narrow subset of data. Data is guaranteed to converge, but there is no guarantee that the final result makes any kind of semantic sense in the application domain.

One can write custom conflict resolution and treat the data structures as a convenient baseline for event sourcing, but that requires a lot of work and potentially often user guided resolution.

I'd really love to see some research into deriving CRDT merge semantics from a formal description of application behaviour.

pvh · on Jan 31, 2023

In fact, convergence is a very easy property to preserve in all distributed systems. The trivial but technically valid version of convergence is to throw away all the writes and always return an empty document. A "last writer wins" version at the document level is what you get from a blob store like S3, but while it does converge, it's not that great either.

What we probably want from a distributed system is useful convergence properties that preserve the intent of the participants. A CRDT might not be a good fit for a bank account: if we can both withdraw the last $20 from my account, the bank will be upset. On the other hand, it's a pretty great way of combining independent observations into a list: it doesn't matter what order the observations arrive. Easy!

Most CRDTs aim to preserve causality: if I see your change, and then make my change, my new value will win. If we both make changes without knowing about each other, that's a conflict.

Of course if we both edit unrelated fields -- maybe it's not a conflict! At least, that's how we handle it in Automerge.
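
One common way to express the "did I see your change first?" rule is a version vector per value. The sketch below is only an illustration of that idea, not a description of Automerge's internals; `happened_before` and `relation` are hypothetical helper names.

```python
def happened_before(a, b):
    """True if version vector `a` is strictly older than `b`."""
    keys = a.keys() | b.keys()
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_smaller = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and some_smaller


def relation(a, b):
    if happened_before(a, b):
        return "b wins"        # b was written after seeing a
    if happened_before(b, a):
        return "a wins"        # a was written after seeing b
    keys = a.keys() | b.keys()
    if all(a.get(k, 0) == b.get(k, 0) for k in keys):
        return "equal"
    return "conflict"          # neither side knew about the other


yours = {"you": 1}
mine_after_seeing_yours = {"you": 1, "me": 1}
mine_without_seeing_yours = {"me": 1}

seen = relation(yours, mine_after_seeing_yours)
blind = relation(yours, mine_without_seeing_yours)
```

Keeping one such version per field, rather than per document, is one way edits to unrelated fields would never be compared against each other at all.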

In the most conservative case, we should never merge data automatically. Most systems have unmodeled constraints. For example, sometimes a `git` merge will produce no conflicts but fail to compile anyway. Git's model (another CRDT) doesn't model program behaviour, nor do we expect it to. In this case, we rely on a combination of our experience, programming tools, and git's version history tooling to figure out what went wrong.

The conclusion I have is that a CRDT should give us robust tools for minimizing conflict, but also needs to be able to explain how things got to be the way they are and what you can do to make them how you want.

We've made a decent amount of progress on this in Automerge and have a paper coming up about this problem soon but I agree there's still more distance to go. If there are particular questions you have about merge semantics, I'm all ears! We'll continue to explore this space for the foreseeable future and I love to hear about new questions.

The last thing I want to add is that when you say "CRDTs are only really useful for a narrow subset of data", you're really drawing a lot of conclusions all at once about other people's needs and interests. From my perspective, CRDTs are useful for a lot of kinds of data. Not everything certainly, but from where I sit, perhaps more kinds of data than a limited single-node relational database and more kinds than a POSIX file which doesn't retain any history at all.

josephg · on Jan 31, 2023

> From my perspective, CRDTs are useful for a lot of kinds of data.

Yep I 100% agree.

I think the highest value uses for technology like this are in creative applications. I think about wikis, blogs, shared whiteboards, music production and video editing. In all of these cases, "referential integrity" (database constraints) doesn't really matter that much, and the working set is usually pretty small.

Sketch was outcompeted by Figma because figma used a CRDT as its backend, which enabled it to be collaborative. Sketch had an arguably better product, and was first to market. But it was stuck in the single-editor model because they didn't have a tool like automerge.

As for conflicts, increasingly my favorite CRDT for "general purpose" data (JSON trees) is MVRegisters. In the case of a conflict, a MV (Multi-value) register stores all of the conflicting values. But the application doesn't have to care - we can still treat it like a "single writer wins" register.

To make this work, the CRDT provides two APIs: a simple API and a complex API:

- The simple API just gives the application "the current value". In the case of concurrent edits, the system quietly chooses a winner. This is enough for most software most of the time. It's certainly enough to get started.

- The complex API returns all current values when a conflict has happened. Applications further along in their development lifecycle can use this API to present conflicts to the user and ask the user what should happen. (Or the application can resolve the conflict itself using application-specific logic).

The nice thing about this approach is that the data itself doesn't have to change. It's just an application / UI change to show conflicts. So collaborative applications can be written without caring about conflicts (at first). And later, when conflicts between multiple users cause problems, the applications can move to a richer API if they want to. (And remember, it all works like git under the hood anyway. We can store the full history so even when conflicts are resolved in a weird way, you still haven't lost the users' original edits.)
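
A minimal sketch of the two-API idea, assuming version vectors to detect concurrency. The `MVRegister` class, its method names, and the "largest `repr` wins" rule for the simple API are all assumptions made for illustration; real libraries pick their winner differently.

```python
def dominates(a, b):
    """True if version vector `a` has seen everything in `b`, and more."""
    keys = a.keys() | b.keys()
    return (all(a.get(k, 0) >= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) > b.get(k, 0) for k in keys))


class MVRegister:
    def __init__(self, replica):
        self.replica = replica
        self.entries = []   # list of (value, version_vector) pairs

    def set(self, value):
        # The new write supersedes every value this replica currently holds.
        vv = {}
        for _, seen in self.entries:
            for k, n in seen.items():
                vv[k] = max(vv.get(k, 0), n)
        vv[self.replica] = vv.get(self.replica, 0) + 1
        self.entries = [(value, vv)]

    def merge(self, other):
        combined = self.entries + [e for e in other.entries if e not in self.entries]
        # Keep only entries that no other entry has superseded.
        self.entries = [(v, vv) for v, vv in combined
                        if not any(dominates(o, vv) for _, o in combined)]

    def values(self):
        """Complex API: every value that survived, i.e. the conflict set."""
        return sorted((v for v, _ in self.entries), key=repr)

    def value(self):
        """Simple API: quietly pick one winner, the same one on every replica."""
        return self.values()[-1]


a, b = MVRegister("a"), MVRegister("b")
a.set("red")
b.merge(a)
a.set("green")      # concurrent with the next line
b.set("blue")
a.merge(b)
b.merge(a)
conflict = a.values()
winner_a, winner_b = a.value(), b.value()

a.set("purple")     # written after seeing both, so it resolves the conflict
b.merge(a)
resolved = b.values()
```

The stored data is identical whichever API the application reads from, which is why moving from `value()` to `values()` later is only a UI change.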

pharmakom · on Jan 31, 2023

Your mention of Git reminds me of CI and makes me think of a general strategy:

1. Allow the user (of the CRDT library) to define a fitness function that should be minimised

2. When multiple valid merges are possible, pick the result according to the fitness function
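
As a rough sketch of that strategy (not a feature of any CRDT library I know of): the fitness function below counts broken invariants, and a deterministic tie-break is added so every replica picks the same candidate. `pick_merge`, `fitness`, and the booking example are hypothetical.

```python
def fitness(doc):
    """Lower is better: count the invariants this candidate breaks."""
    return 1 if doc["start"] > doc["end"] else 0


def pick_merge(candidates, fitness):
    # Ties are broken on the sorted contents so all replicas agree on the choice.
    return min(candidates, key=lambda doc: (fitness(doc), sorted(doc.items())))


# Base was {"start": 1, "end": 10}. Alice moved start to 8, Bob moved end to 5.
alice = {"start": 8, "end": 10}
bob = {"start": 1, "end": 5}
field_wise = {"start": 8, "end": 5}   # naive per-field merge: start is after end

chosen = pick_merge([field_wise, alice, bob], fitness)
```

The per-field merge is conflict-free but invalid, so the fitness function steers the result to one of the valid candidates. Which valid candidate wins is still arbitrary, so this narrows the problem rather than solving it.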

crabmusket · on Jan 31, 2023

> Most CRDTs aim to preserve causality: if I see your change, and then make my change, my new value will win. If we both make changes without knowing about each other, that's a conflict.

I haven't kept track of CRDTs since I worked with them in ~2015 and having read the paper by Shapiro et al, but I thought a casual description would be more along the lines of "once we both receive each other's changes, we will agree on the final state"? Or does that no longer reflect current state of the art, or was I just mistaken at the time?

lll-o-lll · on Jan 31, 2023

Would you say that automerge is useful for applications that don’t involve a human? I’m imagining a cluster of “service registry” services that use automerge as a way to manage shared state between them. There wouldn’t be a human to fix a merge conflict, so all possible merge outcomes would need to be well defined.

The CRDT examples I see are all oriented around human collaboration, are they a bad choice for something more akin to a distributed database?

pharmakom · on Jan 31, 2023

There is a talk about using CRDT across a server cluster to maintain a social media “like” counter
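
I don't know which design that talk used, but the textbook structure for this kind of counter is a grow-only counter (G-Counter): each server only ever increments its own slot, and merging takes the per-server maximum. A minimal sketch with made-up names:

```python
class GCounter:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}          # node id -> number of likes recorded there

    def increment(self, n=1):
        # A node only ever touches its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        # Per-node maximum: safe to apply repeatedly and in any order.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())


s1, s2 = GCounter("server-1"), GCounter("server-2")
for _ in range(3):
    s1.increment()
for _ in range(2):
    s2.increment()

s1.merge(s2)
s2.merge(s1)
s1.merge(s2)   # merging again changes nothing
total_1, total_2 = s1.value(), s2.value()
```

No human is needed here because every merge outcome is well defined: the total can only go up, and all servers end on the same number.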

refset · on Feb 6, 2023

> I'd really love to see some research into deriving CRDT merge semantics from a formal description of application behaviour.

I'm no expert, but I believe synthesis approaches are a strong step in that direction: https://arxiv.org/abs/2205.12425

ghoomketu · on Jan 31, 2023

Sorry if this is a stupid question, but how do I write the JSON changes to a MySQL database via PHP? From what I can understand, the JSON is being updated on the client side via JavaScript, but to save the changes do I have to send the entire JSON doc to the server, or is it possible to somehow patch the JSON on the PHP side as well? I want to create Google Docs-style autosave functionality, so sending the whole JSON seems quite wasteful.

I'm learning web programming and this seems quite useful for what I'm doing. Any tips on how to do it in php with this?

1123581321 · on Jan 31, 2023

Use the sync methods, which send, receive and merge packages of changes until all local instances of the data agree that they are caught up. These packages are intelligently created by Automerge and balance the amount of data transferred against how quickly all instances are consistent. Your PHP server app would keep track of all the local instances and pass the sends to them. You have a lot of freedom in how you implement this so long as you process the messages.

Eventually you’ll want to implement this using web sockets for performance, using a system like Laravel Broadcasting.

You can also package the total state JSON as a compact binary and store it in your server app, but you wouldn’t use that file to keep clients synced.

https://automerge.org/docs/cookbook/real-time/

francislavoie · on Jan 31, 2023

I have a usecase where I think this would be useful, but it's the PHP backend producing the changes to state, not the front-end. I have a test runner that runs async on the server via a job system, and I want to sync the state to the front-end. This means I'd have to produce the diff in the backend.

What are my options for that? There's no PHP library for that, it seems. Is their goal to have someone build a PHP C extension (or FFI) to call down to their library? That seems not very fun, because it's somewhat less portable than having a pure PHP implementation (even if it might be less performant).

1123581321 · on Jan 31, 2023

You’re right, there doesn’t seem to be a PHP SDK yet. This is unholy, but perhaps you could execute it in a node environment with v8js. https://github.com/phpv8/v8js

Otherwise I think you’d be looking at a headless browser in the test runner.

renke1 · on Jan 31, 2023

I am currently using yjs. What would be the equivalent way as described here [0] for yjs to sync docs in Automerge? I don't need any WebSockets or real time stuff. It always seemed so complicated in Automerge compared to yjs. I just want to roll my own simple sync mechanism via HTTP.

[0]: https://docs.yjs.dev/api/document-updates#syncing-clients

OJFord · on Jan 31, 2023

Looks pretty much the same with different words?

https://automerge.org/docs/cookbook/real-time/#changes-inter...

mateusfreira · on Jan 31, 2023

Great improvements. Performance was probably the major concern with previous versions of automerge. I have begun reading about CRDTs, and this looks like the right direction for the field at the moment.

Local first in my mind means your users can still work even if your server is down for a small amount of time for any reason, which means the users can also jump on an airplane and decide to review/update their price table or check out their sales commissions if they want. It benefits both sides of the software economy (users and providers)[1].

I recommend reading the 2019 paper about local first; it is not too academic and gives a good view of the challenges [2].

[1] https://mateusfreira.github.io/@mateusfreira-2022-12-04-my-t...

[2] https://martin.kleppmann.com/papers/local-first.pdf

quartz · on Jan 31, 2023

Excited to see this!

I built a personal project with automerge 1 recently because I liked the philosophy of offline-first from the team and because the docs were frankly much more approachable than yjs but ended up switching to yjs half way through because of performance issues and also for the rich text support (via the delta doc type).

A little bummed about 2.0 not working with react native because of webassembly but excited to see the peritext work progress for rich text coming soon.

CRDT is one of those technologies I assumed was “done” a decade ago and I was surprised to learn how much of the major progress was only just recently made when I dug in.

Definitely expecting to see some cool new multiplayer startups built on this tech.

riverdweller · on Jan 31, 2023

Would be great to have a CDN-based JS distribution for those who want to play without the heartache of JS build systems (npm/yarn/webpack/etc).

tluyben2 · on Jan 31, 2023

For now we use pouch/couch for this purpose, which does this (merge docs automatically and pick a winner) out of the box, but has the disadvantage of having to run couch which is an infrastructure pain. We have been exploring substituting it with crdt and this release seems to be the sign of maturity we needed to get us over the line.

LAC-Tech · on Jan 31, 2023

I think you're misunderstanding how couch/pouch works.

It doesn't merge documents automatically - it deterministically picks a "winning" version of the data, but the winner is completely arbitrary. You need to look at the conflicting versions to properly resolve conflicts, otherwise you're pretty much just rolling a dice and saying "yeah that version of the data will do whatever". There's no actual merging going on.

MrBuddyCasino · on Jan 31, 2023

> Cloud software is fragile and prone to outages, rarely supports offline use, and is expensive to scale to large audiences.

Hint to whoever wrote this: except the offline thing, this has not been my experience at all. Automerge sounds cool enough as is, no need to make up reasons why Cloud Bad.

MayeulC · on Jan 31, 2023

Both y-crdt and Automerge are rust libraries (though the latter seems to have a lot more targets), is there a short rundown/vs comparison somewhere? This article compares performance with yjs, but not y-crdt, and there are probably other comparison points to be studied.

LAC-Tech · on Jan 31, 2023

This is really cool. There's a large class of applications for which deterministic conflict resolution makes a lot of sense. Having a mature library available and not having to implement your own versions from research papers is great.

At least in theory.. anyone used this?

x-complexity · on Jan 31, 2023

This is awesome, especially the massive improvements to performance: The fact that a CRDT file can be within 2x the size of a plain text file whilst still fully loading within 2s is a great sight to behold.

chaxor · on Jan 31, 2023

Can you cram a duck DB in this? Or maybe SQLite? I suppose you can export several tables into jsonlines files, but it might be cool to have a cleaner solution on the backend.

wizzard0 · on Feb 2, 2023

The very very hard part is merging across schema changes (i.e. ALTER TABLE on one machine, concurrent inserts/updates on another)

I’ve made a schema-aware CRDT previously that can chew thru concurrent JSON schema change, doing the same for SQL is certainly doable but a huge amount of work

therockhead · on Jan 31, 2023

Is it advised to use CRDTs for an offline app that syncs to a central server, such as a todo app like Apple's Reminders or Todoist? Can simpler methods suffice?

jamil7 · on Jan 31, 2023

You can if you like but, with a centralised server, you can do all the merging logic in one place without a CRDT. With a CRDT, you'd be storing the entire history of a document, which might be overkill for something like a todo app that's just doing last write wins.

LAC-Tech · on Jan 31, 2023

I hate to give a vague answer, but it really depends.

What do you think a todo app should do when it detects a conflict/concurrent update?

pharmakom · on Jan 31, 2023

Is there a way to hard delete old history in CRDTs? I’m thinking about legal and privacy requirements.

CGamesPlay · on Jan 31, 2023

I know this is possible in Yjs: replace the key itself with a new instance of a text CRDT, and populate it with the latest value. Such a change will destroy any concurrent edits, however (concurrent changes will be overwritten by the new instance of the CRDT upon merge). A more complex solution is garbage collection, which depends on the internals of the CRDT. I don't think Yjs exposes this for specific edits in text fields.

Automerge... the same approach would work but Automerge advertises itself as storing the full history, so I think the history of the root object would leak the data. I am not sure if it's possible to erase such history with Automerge.

pharmakom · on Jan 31, 2023

This is a big problem!

I like full history most of the time, but there are situations where hard delete is a... hard requirement.

MzHN · on Feb 1, 2023

There is Antimatter https://braid.org/antimatter

jchook · on Jan 31, 2023

Any insight for why they choose CRDTs over Operational Transform?

AFAIK Google Docs uses OT, for example.

mkl · on Jan 31, 2023

OT needs an authoritative server to coordinate things, but CRDTs can be purely peer-to-peer.

EGreg · on Jan 31, 2023

I loved Automerge and they chose all the right tech. But they disclosed a huge caveat: the system got massively slower as more operations were done on the CRDT. Is that still the case or was it fixed in 2.0?

pvh · on Jan 31, 2023

The article has quite a few performance numbers in it. The short answer is that it's much, much faster but that we will continue to pursue improvements pretty much forever.

brunoqc · on Jan 31, 2023

with CRDTs, can you do something like "discard old revisions, like after 1 year", to make it more efficient?

wizzard0 · on Feb 2, 2023

definitely, but it depends on the implementation and usage (e.g. you might still want to keep the “created date” or “yearly snapshots”), so it's not automatic

codeptualize · on Jan 30, 2023

Link isn't working (extra /), https://automerge.org/blog/automerge-2/

llimllib · on Jan 30, 2023

yikes! no idea how that happened, I'll delete and resubmit

edit: there's no way to delete a submission? weird

dang: any way you'll see this and fix the URL?

dang · on Jan 31, 2023

You submitted https://automerge.org/blog/automerge-2/. But that page has https://automerge.github.io//blog/automerge-2/ as its canonical URL (note the extra slash).* The canonical URL redirected back to automerge.org, still with the double slash. I've fixed the URL now.

* Our software canonizes URLs when it can. I suppose we could make sure the canonical doesn't 404 before doing that, but it's a bit tricky when there's an additional redirect etc.

pvh · on Jan 31, 2023

Thanks. We'll fix that before the next one.

llimllib · on Jan 31, 2023

Thanks for fixing!

satvikpendem · on Jan 30, 2023

There is a delete button in one's submission history I believe, so for you it'd be https://news.ycombinator.com/submitted?id=llimllib

llimllib · on Jan 30, 2023

no delete button there for me

layer8 · on Jan 30, 2023

You can’t delete anymore once there are comments, I believe.

dang · on Jan 31, 2023

That is correct.

satvikpendem · on Jan 30, 2023

Ah, well dang might fix it if he sees this, or you could email them too.

alixanderwang · on Jan 30, 2023

you could just resubmit, and leave a comment here pointing there. i'll look for it in the new section to upvote!

dqpb · on Jan 31, 2023

> you can just think of it as a version controlled data structure. Automerge lets you record changes made to data and then replay them in other places

This is not my understanding of what a CRDT is. From Wikipedia:

> [CRDTs] send their full local state to other replicas, where the states are merged by a function which must be commutative, associative, and idempotent. The merge function provides a join for any pair of replica states, so the set of all states forms a semilattice. The update function must monotonically increase the internal state, according to the same partial order rules as the semilattice.

For those people saying there is no such thing as conflict-free, there is! But only for datatypes that satisfy these constraints.
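
To make those three constraints concrete, here is a small brute-force check over a handful of states, using set union as the merge function. This is only a spot check on sample values, not a proof, and the sample states are arbitrary.

```python
from itertools import product


def merge(a, b):
    return a | b   # set union: the join of a grow-only set


states = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2, 3})]

commutative = all(merge(a, b) == merge(b, a)
                  for a, b in product(states, repeat=2))
associative = all(merge(merge(a, b), c) == merge(a, merge(b, c))
                  for a, b, c in product(states, repeat=3))
idempotent = all(merge(a, a) == a for a in states)

# Counter-example: plain integer addition fails idempotence,
# so "merge by adding the two counters" is not a valid CRDT merge.
addition_idempotent = all(x + x == x for x in [1, 2, 3])
```

Union passes all three checks. Addition fails the last one, which is why counter CRDTs merge per-replica maxima instead of summing the two states.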

munhitsu · on Jan 31, 2023

Any clue when the team will be releasing the swift version? I've been experimenting with the 1.x release to use in a Markdown editor but I had to revert for now to using my own CRDT implementation. Mainly because of speed (caching is hard but useful) and API limits (e.g. need to update cursor, and index text in documents). Still I love the abstraction of Automerge that I can treat as a black box container that I can just trust to do the right thing in the most optimal manner. It's a very nice abstraction.

ref to my implementation: https://github.com/munhitsu/CRAttributes https://github.com/munhitsu/CRAttributesDemo

jasmer · on Jan 31, 2023

Can someone answer how we go about the single-source-of truth problem in this distributed scenario? Or does this approach guarantee consistency among all synced data so that they are all effectively the same?

I can see how this would be used for 'live data sharing' but what about for more persistent information, like documents, designs etc?

LAC-Tech · on Jan 31, 2023

The term used in the CRDT papers is "strong eventual consistency" - basically it's an eventually consistent system with the added guarantee that any two replicas that have received the same updates - in any order - will have the same state.
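
A small sketch of that guarantee, assuming a last-writer-wins map where each update carries a unique `(counter, replica)` stamp. The update format and the `apply` helper are made up for illustration. The loop delivers the same four updates in every possible order and collects the resulting states.

```python
from itertools import permutations


def apply(state, update):
    """Apply one update; the higher stamp wins for each key."""
    key, value, stamp = update
    current = state.get(key)
    if current is None or stamp > current[1]:
        state[key] = (value, stamp)
    return state


updates = [
    ("title", "Draft", (1, "a")),
    ("title", "Final", (2, "b")),
    ("owner", "sam", (1, "b")),
    ("title", "Draft v2", (2, "a")),
]

final_states = set()
for order in permutations(updates):
    state = {}
    for update in order:
        apply(state, update)
    final_states.add(tuple(sorted(state.items())))

only_state = dict(next(iter(final_states)))
```

All 24 delivery orders collapse to a single final state, which is the "same updates, any order, same state" property. Whether "Final" beating "Draft v2" is what the users wanted is a separate question.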

So as for documents etc, if you can find or come up with a CRDT where the automatic merging function will give you something that makes sense for a user - sure.

satvikpendem · on Jan 31, 2023

> Or does this approach guarantee consistency among all synced data so that they are all effectively the same?

Correct, this is what CRDTs are for, eventual consistency.