Pijul: A Rust based distributed version control system (pijul.org)
209 points by philonoist on Oct 17, 2018 | 46 comments



I think the more interesting thing is that the data model of pijul is a successor / refinement to the types of ideas found in Darcs. (They even cohosted some hack events a few years back?)

Either way, the model of patches they have is pretty neat last I looked. Definitely a good read.


My company used Darcs for several years roughly around 2006-2009. It was great.

One of the magic aspects of Darcs was that picking a commit from another branch would include all the other commits that it depended on, which meant it was much easier to migrate logical changes between branches. With Git, if you're not merging/rebasing all of a branch, you have to manually untangle everything.
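To make the "picking a commit brings its dependencies along" idea concrete, here is a minimal sketch of a dependency closure. It is not Darcs's actual implementation: the patch names and the `depends_on` table are hypothetical, and a real system derives dependencies from the patches themselves rather than from a hand-written table.

```python
def dependency_closure(wanted, depends_on):
    """Return `wanted` plus every patch it transitively depends on."""
    closure = set()
    stack = [wanted]
    while stack:
        patch = stack.pop()
        if patch in closure:
            continue
        closure.add(patch)
        # Queue the direct dependencies; they are expanded on later iterations.
        stack.extend(depends_on.get(patch, ()))
    return closure


# Hypothetical patches on a feature branch and their direct dependencies.
depends_on = {
    "add-parser": [],
    "add-cli-flag": ["add-parser"],
    "fix-typo-in-readme": [],
    "use-flag-in-tests": ["add-cli-flag"],
}

picked = dependency_closure("use-flag-in-tests", depends_on)
print(sorted(picked))
```

Picking `use-flag-in-tests` pulls in the two patches it builds on and leaves the unrelated README fix behind, which is the untangling you would otherwise do by hand.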

Darcs also pioneered the interactive "hunk record" interface, which Git eventually added as "git add -p".

Darcs' main problem was performance. Back when we were using it, it had an edge case where the merge algorithm sometimes went into exponential time, and it could take hours for it to resolve a conflict. It tended to happen at the worst possible time, and of course nobody could hack Darcs to fix it -- even the main author of Darcs seemed to struggle to find a solution. To my recollection, when Darcs 2 came out with (apparently) a fix, GitHub was already happening, and switching over to Git was a no-brainer. Git was considerably worse than Darcs, and it still isn't nearly as user-friendly, but on the other hand, I've never lost any productive time to Git bugs.

Pijul is inspired by Darcs and uses a similar "theory of patches", but it doesn't suffer from Darcs' issues, last I checked. Rust also has more mindshare than Haskell, so it's got a higher chance of being adopted.


It's worth noting that Darcs is seriously considering just porting Pijul's algorithm into their next version to fix some performance issues.

http://blog.darcs.net/ and search for "Pijul" to find multiple mentions of this.

> Rust also has more mindshare than Haskell, so it's got a higher chance of being adopted.

I'm... uh... not so sure about this. The Haskell community is quietly pretty big these days. It's not that Rust isn't growing rapidly (and is growing fast due to consistent direct capital infusion from Mozilla), but as of the first half of 2018 it certainly felt like a smaller and more specialized community than the rather large surface area Haskell covers.


I don't mean the size of the community, but that Rust has an edge as far as being "mainstream-adoptable" and more mainstream-friendly overall.


I used darcs in about the same period, and in a properly distributed fashion. The ability to cherry-pick a whole feature from a random repo was magical. Sadly, performance issues were common enough to make it painful to use compared to the alternatives.


Yeah, the logic behind it is based on patch theory (a subset of category theory) as put forth in this paper: https://arxiv.org/abs/1311.3903

It's a really interesting idea.


The title needs to be fixed to just "Pijul". The word "Rust" does not appear on the linked page, nor is Pijul "Rust-based" in any meaningful sense. It is software that happens to be implemented in Rust but could be implemented in any other language. If Pijul is "X-based" for any value of X, that X would be "category theory" (though that term doesn't appear on the page either).


It’s an open source project. Maybe the implementation language is mentioned for hackers (this is Hacker News, after all) who might want to contribute?


True, but the submission guidelines (https://news.ycombinator.com/newsguidelines.html) say: "[...] please use the original title, unless it is misleading or linkbait; don't editorialize". Adding the programming language to the title is editorializing (or more probably fishing for the Rust camp's upvotes). The submitter is free to comment on their own submission and editorialize there.


Nothing about this title is false or even disputable. It's a Rust-based version control system.

Please take your language hate elsewhere, it's not constructive here. If you have a problem with Rust or its community, wait for an appropriate thread to editorialize here rather than invoking the rules in a way that's contrary to their common interpretation or enforcement, as if you've got some sort of authority.


>language hate

Parent poster raised the possibility that OP's motivation for editorializing in the title was to make it more attractive to a set of readers. How exactly does that count as "hate"?

HN generally discourages titles that differ from the source material. OP added the bit about Rust. You're deflecting.


> Parent poster raised the possibility that OP's motivation for editorializing in the title was to make it more attractive to a set of readers. How exactly does that count as "hate" these days?

Why is additional factual data "editorializing"? Why is it a problem to say, "Oh and hey this is written in Rust?" What's actually immoral about engaging people capable of reading Rust to learn more about revision control?

This is Hacker News. Everything here is for the edification of folks interested in software, its venues and applications. That seems like a good thing for posts to do. Folks who can read Rust will have extra reason to check out a worthy project, and perhaps look at how it works. Disseminating that kind of knowledge is precisely the kind of positive outcome sites like this look to drive.

I think it's "hate" because the only people mad about mentioning Rust so far have directly cited the idea that it's unfair to engage the Rust community on this site. It's difficult to even pick through the layers of this statement, it's so densely networked with bizarre assumptions. Thus, in my opinion it's language hate because the complaint is not, "This story would not be relevant without mentioning Rust," the complaint is, "This story's title shouldn't mention an interesting and important point." It very much seems like the goal is to suppress mentioning Rust, not to enforce a bizarre and unconventional reading of the rules.

> OP added the bit about Rust. You're deflecting.

I'm not deflecting that at all. I'm challenging the assertion that it's a rules violation or "editorializing". I've been here quite a while and engaged with the mods a whole bunch, and I've never once seen "salient technical info" be marked as "editorializing." The Pijul folks make no secret of their use of Rust, and they're happy to talk about why they chose it and why they like it.


>>> as if you've got some sort of authority

If I had the authority to change the title, I would have done it. As it is, I only claim the "authority" to point out that the submitted title should be changed in accordance with the guidelines; something that others often do as well.

> a bizarre and unconventional reading of the rules.

The rule as quoted is "[...] please use the original title, unless it is misleading or linkbait; don't editorialize". I read that as an actual rule ("use the original title") with a qualification (the clickbait part) and an additional statement about editorializing that only reinforces the rule. We can drop that last part if you like. In that case, the rule is (still) "use the original title". How am I reading this in a bizarre and unconventional way?

> Folks who can read Rust will have extra reason to check out a worthy project, and perhaps look at how it works.

Yes. And if OP had added the bit about category theory, then folks interested in category theory would have extra reason to check Pijul out. If OP had added that Pijul is based on a Darcs-like concept of an algebra of patches, folks interested in Darcs would have extra reason to check it out, etc.

Picking and choosing what additional -- even factual -- info to add or not to add to the original title is edito^H^H^H^Hexactly what the guidelines say that a submitter should not do.

> The Pijul folks make no secret of their use of Rust, and they're happy to talk about why they chose it and why they like it.

A page about that would make a great HN submission. The page submitted here is not that page; as I pointed out, it never mentions Rust anywhere.


> If I had the authority to change the title, I would have done it. As it is, I only claim the "authority" to point out that the submitted title should be changed in accordance with the guidelines; something that others often do as well.

Yes, it's clear you would.

> How am I reading this in a bizarre and unconventional way?

Because you're suggesting that mentioning the technology used for the project is "editorialization" or "clickbait."

Quite frankly, everything you've done here has been done not with the intent of actually fixing a problem, but rather of escalating the argument about how mentioning Rust is unfair (although unfair to what? It's unclear) or a rules violation (which if it is, it is at such a mild edge of that spectrum that it's an extremely common type of infraction). I cannot imagine what sort of outcome you were hoping for, but it appears you didn't get it.


> "editorialization"

I suggested that we drop this part if it makes you uncomfortable and focus on the core of the rule: "use the original title".

> "clickbait."

Something that appeared in text that I quoted, but which I did not suggest applied here.

> unfair

I did not use this word. You are arguing in extremely bad faith.

> a rules violation

Yep. The rule, again, being "use the original title".

> I cannot imagine what sort of outcome you were hoping for

In a thread that literally starts with the statement of the outcome I was hoping for? Here it is again: 'The title needs to be fixed to just "Pijul".'

I hope you have a wonderful day.


The mods have been here. In this very thread. They have elected not to change the post.

We can only conclude that if this is an infraction of the rules, it's not sufficient to invoke moderator interaction. I hope that this point is clear to you.

What's more, this happens dozens of times a day here on Hacker News, yet I cannot find even a single other instance of you leveling this complaint even as you comment in threads that commit this infraction.

So clearly, even you don't believe it should be universally enforced. You only perked up to this rule when Rust got mentioned.


Rust is new, and seeing software implemented in it has value on its own. It might also indicate the maturity of the language, and the authors' choice of a niche language might indicate that the goals of the project align with those of the language. I'd say that a VCS is a good fit for Rust.

If this was written in JavaScript I wouldn't have been interested.


Guys, I found the grammar nazi. Geez, chill out man.


Can you please stop breaking the guidelines? We ban accounts that won't.

https://news.ycombinator.com/newsguidelines.html


What did the parent comment say that you disagree with?


One can be accurate while still being unnecessarily pedantic.


The reference manual seems to need some love:

https://pijul.org/manual/reference/log.html Is it pijul log or pijul changes? Also I see a huge block of ~30 nested blockquotes below the "Usage" title that probably shouldn't be there.

https://pijul.org/manual/reference/patch.html The subtitle "Output the compressed encoding of a patch (in binary)" seems misleading, since that only describes "pijul patch --bin"?

Why is "pijul record" not called "commit" instead? That's the equivalent, right? It seems all other commands have very familiar names, but this one is just different to be different?

https://pijul.org/manual/getting_started.html#definitions Same for "a pristine, which is a representation of the current recorded version of the repository". What's the "normal" word for that? The definition doesn't really make sense to me.

https://pijul.org/community/ The "Getting started" section contains no information about how to get started. Why no link to https://pijul.org/manual/getting_started.html?



I mean this with love, but the project would possibly get more attention if there were regular, automated releases. The 0.10.x release is not really rebuildable without patching cargo files, which has made it hard for me to package it for nixpkgs/nixos.


The database used by Pijul: https://docs.rs/sanakirja/0.8.16/sanakirja/

Cool project on its own. A "nosqlite" :)


Has the code quality improved? I remember looking at it a couple years ago and finding it unreadable. Types had impl blocks all over the place, and it was hard to untangle. It made me lose interest in it.


> nothing is really reversible

Will Pijul ever support something like git's interactive rebase? I find myself using that constantly. One common flow for me: make some commits, make a final commit bumping my package version, realize I forgot something, make another commit, git rebase -i to move it before the version bump, git push.


As I understand it this isn't really a meaningful operation to Pijul (or Darcs, or patch-theory-based version control in general). The version bump commit and the final commit don't depend on each other so they're not inherently ordered.


What if they do depend on each other though? E.g. you added a changelog but spelt its filename wrong or something.


The backing algorithms figure that out and include all the necessary patches, applying them in an order that works.
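As a rough illustration of "applying them in an order that works", a topological sort over a dependency graph is enough. This is only a sketch under stated assumptions: the patch names are hypothetical, the dependencies are written by hand, and nothing here claims to be how Pijul orders patches internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def application_order(depends_on):
    """Return the patches in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical patches: the rename only makes sense once the changelog exists,
# while the version bump is independent of both.
depends_on = {
    "add-changelog": set(),
    "rename-changelog": {"add-changelog"},
    "bump-version": set(),
}

order = application_order(depends_on)
print(order)
```

The version bump may land anywhere in the result, but `add-changelog` always comes before `rename-changelog`.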

Without understanding the theory behind it, it sounds a lot like magic, but apparently it works.


That does sound like magic. If I made the following three commits in order, how would it figure out that the last commit depends on the first commit but not on the second one?

+ import foo

+ import bar

+ foo.hello()


That's not the sense of "depends on" that applies here. It's a textual relationship- in this case, I believe the last commit would depend on the second, which would depend on the first.

If you're familiar with CRDTs, they're closely related. Patches store enough information to commute, so a given state (e.g. a branch) can be described purely by a set of patches.
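Here is a toy model of both points, textual dependencies and "a state is a set of patches". It is a simplification and not Pijul's real data structures: each hypothetical `Patch` inserts one line after the line added by another patch, there are no deletions or conflicts, and ties between lines inserted at the same spot are broken by patch id purely for the sake of the example.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Patch:
    pid: str               # unique patch id
    after: Optional[str]   # patch whose line this one was inserted after (None = top of file)
    text: str              # the inserted line


def apply(state, patch):
    """Add a patch to a state (a frozenset of patches); its context must already exist."""
    if patch.after is not None and all(p.pid != patch.after for p in state):
        raise ValueError(f"{patch.pid} depends on missing patch {patch.after}")
    return state | {patch}


def render(state):
    """Turn a set of patches into file lines; ties are broken by patch id."""
    children = {}
    for p in state:
        children.setdefault(p.after, []).append(p)
    lines = []

    def walk(parent):
        for p in sorted(children.get(parent, []), key=lambda q: q.pid):
            lines.append(p.text)
            walk(p.pid)

    walk(None)
    return lines


# The three commits from the question form a textual chain a <- b <- c.
a = Patch("a", None, "import foo")
b = Patch("b", "a", "import bar")
c = Patch("c", "b", "foo.hello()")
d = Patch("d", None, "# unrelated line")  # depends on nothing

results = set()
for order in permutations([a, b, c, d]):
    state = frozenset()
    try:
        for patch in order:
            state = apply(state, patch)
    except ValueError:
        continue  # this order applies a patch before its context exists
    results.add(tuple(render(state)))

print(results)
```

Every order that respects the chain produces the same file, because the final state is just the same set of four patches. The independent patch `d` can be applied at any point, which is why reordering it is not a meaningful operation in this model.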


I also use git commit -i semi-regularly, but for this specific case, git commit --amend --no-edit (or possibly without --no-edit, if you want) suffices.


Here's what I wrote on reddit 5 months ago, when 0.10 was released (the edit is from then too; TL;DR: it's not delivering what it promises wrt speed just yet):

I'd report the following on the nest... if it didn't alternate between not responding, HTTP 500 and HTTP 404.

I wanted to give pijul a try, and see how it scales. So I took the mozilla-central mercurial repository, and started applying the initial changesets from there.

The first mercurial changeset only added a .hgignore file, so that went smoothly. The second imported the entire source of Firefox at the time the repository was created. Which, by then, was only 245MB of data. `pijul add` was fast, but `pijul record -a` took a while, and crashed with an IO error... which turned out to be because I created the repository under `/tmp` and pijul filled all the 7.1GB that were free there. After moving the repo to some other place, it turned out to take more than 1 minute to record that change, and 10GB of temporary space, which went down to 1.8GB once done. That's a lot of disk space for 245MB of raw data.

Then I went with the third mercurial changeset, and `pijul add` went quickly. Before doing a `pijul record`, I wanted to see what `pijul status` looked like, and it ran for 12 minutes without a single line of output before I decided I had waited enough and ctrl-c'ed... Then I went with `pijul record -a` directly, and as of writing, it's been running for 10 minutes and hasn't finished yet.

The home page says:

> Pijul started as an attempt to fix the performance of darcs, and ended up among the fastest distributed version control systems.

Maybe that's true for small repositories, but apparently, it scales really badly.

Edit: I did another attempt with a different method, importing each of the 24373 files from the second mozilla-central mercurial changeset with a separate `pijul record`. After 160 files, each `pijul record` was taking > 0.5s; after 260 files, each was taking > 1s; after 340 files, each was taking > 2s; after 440 files, each was taking > 3s; after 510 files, each was taking > 4s. It took more than 10 minutes to get there, and I stopped.


The patented "Shlemiel the painter’s algorithm" [1].

1: https://www.joelonsoftware.com/2001/12/11/back-to-basics/
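For anyone who hasn't read the article, a minimal sketch of the pattern: each operation redoes work proportional to everything done so far, so the total grows quadratically. The linked list below is just the classic illustration; whether anything like it is behind the `pijul record` timings above is a guess, not something the measurements establish.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def append_from_head(head, value):
    """Append by walking from the head every time; returns (head, nodes_visited)."""
    node = Node(value)
    if head is None:
        return node, 0
    visited = 1
    cur = head
    while cur.next is not None:
        cur = cur.next
        visited += 1
    cur.next = node
    return head, visited


def total_visits(n):
    """Total nodes visited while appending n values one at a time."""
    head, total = None, 0
    for i in range(n):
        head, visited = append_from_head(head, i)
        total += visited
    return total


for n in (100, 200, 400):
    print(n, total_visits(n))
```

Doubling the number of appends roughly quadruples the total work (n(n-1)/2 node visits), even though each individual append looks cheap.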


> nothing is really reversible

This is a dealbreaker for me if it means what I think it does, which is that you can never change history to pretend the order of commits was different from what it was in reality.

I understand that certain people's preferred source control workflow involves keeping around every little intermediate commit. That's fine. But a tool should not impose workflows on me, or prevent me from modifying my data in the way I personally choose.

Imagine if vim didn't let you save files with words spelled wrong, because its authors didn't agree that that was a valid way to write prose.


I think you misunderstood what it means. If I understood it correctly, the datastore is append only, but nothing should prevent you from modifying history and pushing a new head.

I sure hope that there is an equivalent of git-filter-branch, though. In case someone commits SSH keys or secrets accidentally, just overwriting them and leaving them in the database is not good enough.


> In case someone commits SSH keys or secrets accidentally, just overwriting them and leaving them in the database is not good enough.

git-filter-branch is likely not the solution.

Revoke the keys/secrets. Then maybe rewrite history too, but afterwards.


I don't think you can ever really reverse anything in git either. Reflog has saved my ass a few times.


Reflog entries expire (by default after 90 days, or after 30 days for entries no longer reachable from the current tip) and they're local only.


`git gc` throws them away. You probably want to `git config --global gc.auto 0` to truly rely on that behavior (its default value is currently 6700 [approx. # of loose objects]).


> Imagine if vim didn't let you save files with words spelled wrong, because its authors didn't agree that that was a valid way to write prose.

This would just be another tool that works for some and not for others.

If the tool doesn't suit your needs, don't use it. It doesn't mean people shouldn't innovate or attempt new things because it doesn't solve your specific problem sets.


I'm unable to build from cargo install pijul

the first error is error[E0277]: the trait bound `T: rand_core::RngCore` is not satisfied .. this could be related but doesn't help https://nest.pijul.com/pijul_org/pijul/discussions/291


new cool VT100 feature of Windows 10.

The trouble with git is the terrible user interface. That's what needs attention.


I can't find the info anywhere. Where is the code of the Nest? Will I be able to run my own instance?


looks like the source code for Nest is not public:

https://nest.pijul.com/pijul_org/nest



