
CRDTs should be able to give you better merge and rebase behaviour. They essentially make rebase and merge commits the same thing - just different views on a commit, and potentially different ways to present the conflict. CRDTs also behave better when commits get merged multiple times in complex graphs - you don’t run into the problem of commits conflicting with themselves.

You should also be able to roll back a single commit or chain of commits in a CRDT pretty easily. It’s the same as the undo problem in collaborative editors - you just apply the inverse of the operation right after the change. And this would work with conflicts - say commits X and Y+Z conflict, and you’re in a conflicting state. You could just roll back commit Y, which is the problem, while keeping X and Z. And at no point do you need to resolve the conflict first.

All this requires good tooling. But in general, CRDTs can store a superset of the data stored by git. And as a result, they can do all the same things and some new tricks.



This is the key point. Once your data structure carries the full edit history instead of reconstructing it from DAG traversal, rebase and merge become different views of the same operation. Not fundamentally different operations with different failure modes.

The weave approach moves ordering into the data itself. That's the same insight that matters in any system that needs deterministic ordering across independent participants: put the truth in the structure, not in the topology of how it was assembled.


AFAIK Pijul already does that, though.


In theory, maybe. In practice… last write wins (LWW) is a CRDT operator, so replace every mention of CRDT with LWW and the issues will be more obvious.

Really though, the problem with merges is not conflicts, it’s when the merged code is wrong but was correct on both sides before the merge. At least a conflict draws your attention.
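A minimal sketch of that failure mode, assuming a toy three-way merge over named regions of a file (the `merge3` and `run` helpers and the `delay` example are hypothetical, not taken from any real tool). One branch changes a helper's unit and fixes the existing caller; the other adds a new caller. No region is edited on both sides, so the merge is clean, yet the result is wrong:

```python
def merge3(base, ours, theirs):
    """Toy three-way merge over named regions; conflicts only if both sides changed a region differently."""
    merged = {}
    for key in base.keys() | ours.keys() | theirs.keys():
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t or t == b:
            merged[key] = o          # only ours changed it (or both agree)
        elif o == b:
            merged[key] = t          # only theirs changed it
        else:
            raise ValueError(f"conflict in {key}")
    return {k: v for k, v in merged.items() if v is not None}

ORDER = ["helper", "main", "extra"]  # fixed layout of the regions in the file

def run(regions):
    """Execute the assembled file and report the values of a and b."""
    env = {}
    exec("\n".join(regions[k] for k in ORDER if k in regions), env)
    return {k: env[k] for k in ("a", "b") if k in env}

base = {"helper": "def delay(seconds):\n    return seconds * 1000", "main": "a = delay(2)"}
# Branch 1: delay() now takes milliseconds, and the existing caller is updated.
ours = {"helper": "def delay(ms):\n    return ms", "main": "a = delay(2000)"}
# Branch 2: adds a new caller that still passes seconds.
theirs = dict(base, extra="b = delay(3)")

merged = merge3(base, ours, theirs)  # merges cleanly, no conflict raised
print(run(ours))    # {'a': 2000}
print(run(theirs))  # {'a': 2000, 'b': 3000}
print(run(merged))  # {'a': 2000, 'b': 3}  <- b should have been 3000
```

Nothing in the text of the merge flags the problem; only a test or a review of the merged result would catch it.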

When I had several large (smart but young) teams merging left and right this would come up and they never checked merged code.

Multiply that by 100 for AI slop these days. And I see people merge away even when the AI altered the tests to suit the broken code.


> In practice… last write wins (LWW) is a CRDT operator, so replace every mention of CRDT with LWW and the issues will be more obvious.

Yeah. A lot of people are also confused by the twin meanings of the word "conflict". The "C" in CRDT stands for "Conflict (free)", but that really means "failure free". I.e., given any two concurrent operations, there is a well-defined "merge" of the two operations. The merge operation can't fail.
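To make the "merge can't fail" meaning concrete, here is a rough sketch of a last-write-wins register, assuming integer timestamps with the replica name as a tie-breaker (the `LWWRegister` class and its field names are illustrative only). The merge always returns a value, which is exactly why it can silently drop one side:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LWWRegister:
    value: str
    timestamp: int   # assumed logical clock
    replica: str     # tie-breaker so the merge is deterministic

def merge(a: LWWRegister, b: LWWRegister) -> LWWRegister:
    """Always succeeds: keep whichever write has the larger (timestamp, replica)."""
    return max(a, b, key=lambda r: (r.timestamp, r.replica))

alice = LWWRegister("return a + b", timestamp=10, replica="alice")
bob = LWWRegister("return a - b", timestamp=11, replica="bob")

merged = merge(alice, bob)
print(merged.value)  # return a - b  (alice's edit is gone, with no warning)
```

The merge is commutative and idempotent, so it is "conflict free" in the first sense while giving no signal at all in the second sense.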

The second meaning is "conflict" as in "git commit conflict", where a merge gets marked as requiring human intervention.

Once you define the terms correctly, it's possible to write a CRDT-with-commit-conflicts. Just define a "conflict marker" that is sometimes emitted when merging. Then merging can be defined to always succeed, sometimes emitting conflict markers along the way.
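One possible shape for this, sketched as a multi-value register for a single line of text (the `Write`, `write`, `visible`, and `render` names are made up for illustration, and a real design would track whole files rather than one value). The merge is plain set union, so it always succeeds; conflict markers only appear when the state is rendered and more than one concurrent write is still visible:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Write:
    id: str                # unique write id, e.g. "alice-1" (assumed unique)
    value: str
    supersedes: frozenset  # ids of the writes this one replaced

def merge(a: frozenset, b: frozenset) -> frozenset:
    """Set union: commutative, associative, idempotent, never fails."""
    return a | b

def visible(state):
    """Writes that no other known write has superseded, in a stable order."""
    dead = set()
    for w in state:
        dead |= w.supersedes
    return sorted((w for w in state if w.id not in dead), key=lambda w: w.id)

def write(state, write_id, value):
    """Record a new write that replaces everything currently visible."""
    heads = frozenset(w.id for w in visible(state))
    return state | {Write(write_id, value, heads)}

def render(state):
    heads = visible(state)
    if not heads:
        return ""
    if len(heads) == 1:
        return heads[0].value
    # More than one concurrent value: emit a conflict marker instead of failing.
    return "<<<<<<<\n" + "\n=======\n".join(w.value for w in heads) + "\n>>>>>>>"

base = write(frozenset(), "base-1", "x = 1")
alice = write(base, "alice-1", "x = 2")
bob = write(base, "bob-1", "x = 3")

merged = merge(alice, bob)
print(render(merged))    # shows both values between conflict markers
resolved = write(merged, "alice-2", "x = 5")
print(render(resolved))  # x = 5
```

Resolving the conflict is just another write that supersedes both heads, so the resolution itself merges like any other edit.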

> Really though, the problem with merges is not conflicts, it’s when the merged code is wrong but was correct on both sides before the merge.

CRDTs have strictly more information about what's going on than Git does. At worst, we should be able to remake Git on top of CRDTs. At best, we can improve the conflict semantics.


> CRDTs have strictly more information about what's going on than Git does. At worst, we should be able to remake Git on top of CRDTs. At best, we can improve the conflict semantics.

That is a worthwhile goal, but remember that code is just a notation for some operation; it's not the operation itself (which is carried out by a processor). Just like a map is a description of a place, not the place itself. So semantics exist outside of the code, and you can't solve semantic issues with CRDTs.

As code is formal and structured, a version control conflict is a signal, not a nuisance. It may be crude, but it's like a canary in a coal mine. It lets you know that someone has modified stuff you've worked on in your patch. And then it's up to you to resolve the probable semantic conflicts.

But even if you don't have conflicts, you should check your code after a synchronization as things you rely on may have changed since your last one.


Being able to customize the chunking/diffing process with something analogous to an LSP would greatly improve this. In my experience, a particularly horribly handled case is when, e.g., two branches add two distinct methods/functions at the same location in a file (especially if there is some boilerplate, so that the two blocks share more than a few lines).

A language-aware merge could instead produce

  <<<<<<<
  function foo(){ ... }
  =======
  function bar(){ ... }
  >>>>>>>
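A rough sketch of that idea, written in Python rather than the JavaScript-like notation above and using the standard `ast` module in place of an LSP or tree-sitter grammar (the `top_level_functions` and `added_functions_conflict` helpers are hypothetical). It chunks by whole top-level function instead of by line, so shared boilerplate inside the two new functions never gets interleaved:

```python
import ast

def top_level_functions(source: str) -> dict:
    """Map each top-level function name to its exact source text."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }

def added_functions_conflict(base: str, ours: str, theirs: str) -> str:
    """Report functions added on each side as whole-function conflict chunks."""
    b, o, t = (top_level_functions(s) for s in (base, ours, theirs))
    ours_new = [src for name, src in o.items() if name not in b]
    theirs_new = [src for name, src in t.items() if name not in b]
    return "\n".join(["<<<<<<<", *ours_new, "=======", *theirs_new, ">>>>>>>"])

base = "def main():\n    return 0\n"
ours = base + "\ndef foo():\n    log('start')\n    return 1\n"
theirs = base + "\ndef bar():\n    log('start')\n    return 2\n"

print(added_functions_conflict(base, ours, theirs))
```

Since the two additions have different names, a tool could reasonably go one step further and keep both without any marker; the sketch only shows the chunking.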


If you haven't heard of it yet, Mergiraf uses tree-sitter grammars to resolve merges using syntax-aware logic and has a pretty good success rate for my work.



