Hacker News | schacon's comments

This is all fairly speculative, but I didn't get the impression that Monotone was a main inspiration for Git. I think BitKeeper was, in that it was a tool that Linus actually liked using. Monotone had the content-addressable system, which was clearly an inspiration, but that's the only thing I've seen Linus reference from Monotone. He tried using it, bailed because it was slow, and took the one idea he found interesting and built a very different thing with that concept as one part of it. That's how I would interpret the history between these projects.

Linus was definitely aware of and mentioned Monotone. But to call it an inspiration might be going too far. Content-addressable stores were around long before that, mostly for backup purposes afaik. See Plan 9's Venti file system.

I'm curious why you think hg had a prominent role in this. I mean, it did pop up at almost exactly the same time for exactly the same reasons (BK, kernel drama) but I don't see evidence of Matt's benchmarks or development affecting the Git design decisions at all.

Here's one of the first threads where Matt (Olivia) introduces the project and benchmarks, but it seems the list found it comparatively unremarkable and didn't dig into it much:

https://lore.kernel.org/git/Pine.LNX.4.58.0504251859550.1890...

I agree that the UI is generally better and some decisions were arguably better (changeset evolution, which came much later, is pretty amazing), but I have a hard time agreeing that hg influenced Git in some fundamental way.




"One particular aspect that often gets left out of this creation myth, especially by the author of Github is that Mercurial had a prominent role." implies to me that Hg had a role in the creation of Git, which is why I was reacting to that.

For the deadnaming comment, it wasn't out of disrespect, but when referring to an email chain, it could otherwise be confusing if you're not aware of her transition.

I wasn't sponsoring hg-git, I wrote it. I also wrote the original Subversion bridge for GitHub, which was actually recently deprecated.

https://github.blog/news-insights/product-news/sunsetting-su...


> For the deadnaming comment, it wasn't out of disrespect, but when referring to an email chain, it could otherwise be confusing if you're not aware of her transition.

I assumed it was innocent. But the norm when naming a married woman or another person who changed their name is to call them by their current name and append the clarifying information, not vice versa. Jane Jones née Smith. Olivia (then Matt).


> Please don't do that. Don't deadname someone.

Is this not a case where it is justified, given that she at that time was named Matt, and it's crucial information to understand the mail thread linked to? I certainly would not understand at all without that context.


The proper way to do that is say, something like "Olivia (Matt)" and then continue. You use the preferred name, and if you need to refer to the deadname to disambiguate, you do it.

If you can avoid the need to disambiguate, you do that too. The name really is dead. You shouldn't use it if at all possible.


Wait a second. You're saying now hg didn't influence git, but how does that fit with your previous comment?

> One particular aspect that often gets left out of this creation myth, especially by the author of Github is that Mercurial had a prominent role

I'm not sure where you're getting your facts from.


Mercurial had a prominent role in the creation myth. It didn't influence git, but it was there at the same time, for the same reason, and at one time, with an equal amount of influence. Bitbucket was once seen as fairly comparable to Github. People would choose git or hg for their projects with equal regularity. The users were familiar with both choices.

Linus never cared about hg, but lots of people that cared about git at one point would also be at least familiar with some notions from hg.


Good pull. I was wondering if that was a true statement or not. I am curious if Linus knew about that or made it up independently, or if both came from somewhere else. I really don't know.

Ah yes. It was pretty cool that when Peepcode was acquired, Pluralsight asked me what I wanted to do with my royalties there and was fine with me waiving them and just open-sourcing the content.

It also is a testament to the backwards compatibility of Git that even after 17 years, most of the contents of that book are still relevant.


You can use smudge and clean filters to expand this into something on disk and then remove it again before the hash computation runs.

However, I don't think you would want to use the SHA, since that's fairly meaningless to read. You would probably want to expand the ID to the output of `git describe <SHA>`, so it's more like `v1.0.1-4-ga691733dc` and reads like a version number.
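A sketch of what that setup might look like (the filter name `version` and the `@VERSION@` marker are made up here, chosen instead of `$Id$`-style keywords to sidestep shell/sed quoting around dollar signs):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email you@example.com
git config user.name "You"

# smudge runs on checkout (expand the marker); clean runs before the
# hash computation on add (collapse it), so the stored blob stays stable
git config filter.version.smudge \
  'sed "s/@VERSION@/@VERSION: $(git describe --tags --always)@/"'
git config filter.version.clean \
  'sed "s/@VERSION:[^@]*@/@VERSION@/"'
echo '*.txt filter=version' > .gitattributes

printf 'build: @VERSION@\n' > app.txt
git add .gitattributes app.txt
git commit -qm 'initial commit'
git tag v1.0.0

# force a re-checkout so the smudge filter runs with the tag in place
rm app.txt
git checkout -- app.txt
cat app.txt   # build: @VERSION: v1.0.0@
```

Note that the `$(git describe ...)` is stored literally in the config (single quotes) and only expands when Git runs the filter through the shell, so the marker always reflects the current describe output at checkout time.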


You can probably set up smudge and clean filters in Git to do keyword expansion in a CVS-like way.

You'll be the first to know when I write it. If anything, though, GitHub sort of killed the mailing list as a generally viable collaboration format outside of very specific use cases, so I'm not sure I'm the right person to do it justice. It is, however, a very cool and unique format with several benefits that GitHub PR-based workflows really lose out on.

By far my biggest complaint about the GitHub pull request model right now is that it doesn't treat the eventual commit message of a squashed commit (or even independent commits that will be rebased on the target) as part of the review process, like Gerrit does. I can't believe I'm the only person who is upset by this!

If this is something you're interested in, you may want to try the patch-based review system that we recently launched for GitButler: https://blog.gitbutler.com/gitbutlers-new-patch-based-code-r...

This does look interesting! I’ll take a closer look.

You are not alone. Coming from Gerrit myself, I hate that GitHub does not allow for commenting on the commit message itself. Neither does Gitlab.

Also, in a PR, I find that people just switch to Files Changed, disregarding the sequence of the commits involved.

This intentional de-emphasis of the importance of commit messages and the individual commits leads to lower quality of the git history of the codebase.


I think it's more interesting to build tooling that helps with the core problem rather than try to change people's attitudes. The problem imo is not that people are lazy, but that the review tooling is not good at the problem set of large changes that need to be introduced atomically.

Even if a change is not enormous, it can still be difficult to grok the context of a change when it's all squished together. Reviewing a series of smaller changes with good context messages is just mentally easier and technically simple (plus ends up with better commit messages for future context).

I've seen Butler Requests come in via our internal usage where the unified diff is still fairly small, but it's much easier still to review it properly when it's broken up into semantically grouped changes with good commit messages.


Hi, Scott! Thanks for your response.

To be clear, I like the work you're doing. I definitely agree that the unit of review shouldn't be one commit at a time or every commit all at once, and I'm happy to see you invest in tooling that offers a better path.

>I think it's more interesting to build tooling that helps with the core problem rather than try to change people's attitudes. The problem imo is not that people are lazy, but that the review tooling is not good at the problem set of large changes that need to be introduced atomically.

I think it's both, but I think mindset shift has to precede tooling in this case.

If the team/org culture tolerates people just blindly LGTM'ing large commits, that's a culture problem at its core. Even if they have better tooling, what's motivating people to invest extra work to break up changes into small, logical chunks rather than continuing to just throw huge diffs over the wall if those get LGTM'ed with no pushback?


If you want us to cover something on Bits and Booze, just let me know! :)


These days if you do a blobless clone, Git will ask for missing files as it needs them. It's slower, but it's not broken.
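You can see the on-demand behavior locally with something like this (a sketch; for a `file://` partial clone the source repo has to have `uploadpack.allowfilter` enabled, which hosting providers normally enable for you):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q src
git -C src config user.email you@example.com
git -C src config user.name "You"
git -C src config uploadpack.allowfilter true
echo one > src/a.txt
git -C src add a.txt
git -C src commit -qm c1
echo two > src/a.txt
git -C src commit -qam c2

# blobless clone: all commits and trees come down, blobs only as needed
git clone -q --filter=blob:none "file://$tmp/src" partial
cd partial
git log --oneline        # full history is already local
git show HEAD~1:a.txt    # missing historical blob is fetched on demand
```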


Maybe I was doing something wrong, but I had a very bad experience with - tbh don't remember, either blobless or treeless clone - when I evaluated it on a huge fast-moving monorepo (150k files, 100s of merges per day).

I cloned the repo, then was doing occasional `git fetch origin main` to keep main fresh - so far so good. At some point I wanted to `git rebase origin/main` a very outdated branch, and this made git want to fetch all the missing objects, serially one by one, which was taking extremely long compared to `git fetch` on a normal repo.

I did not find a way to convert the repo back to a "normal" full checkout and get all the missing objects reasonably fast. The only behavior I observed was git enumerating / checking / fetching missing objects one by one, which with thousands of missing objects takes so long that it becomes impractical.


The very newest version of Git has a new `git backfill` command that may help with this.

https://git-scm.com/docs/git-backfill


Nice timing! Thanks!


For rebasing, `--reapply-cherry-picks` will avoid the annoying fetching you saw: by default, rebase computes a patch ID for each commit to drop ones that are already upstream, and computing those patch IDs over the diffs is what forces all the blob fetches. `git backfill` is great for fetching the history of a file before running `git blame` on that file. I'm not sure how much it will help with detecting upstream cherry-picks.
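A toy demonstration of that cherry-pick detection (branch and file names are made up): two commits with different IDs but the same diff share a patch ID, which is a hash of the patch text itself, so computing it needs the blob contents on both sides:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main repo && cd repo
git config user.email you@example.com
git config user.name "You"

echo base > f.txt
git add f.txt
git commit -qm base

git checkout -q -b feature
echo change >> f.txt
git commit -qam change

# replay the feature commit onto main, the way an upstream maintainer might
git checkout -q main
git cherry-pick feature >/dev/null

# same diff => same patch ID, even though the commit IDs differ
id_feature=$(git diff-tree -p feature | git patch-id --stable | cut -d' ' -f1)
id_main=$(git diff-tree -p main | git patch-id --stable | cut -d' ' -f1)
echo "feature=$id_feature main=$id_main"
```

Rebasing `feature` onto `main` would, by default, notice the matching patch IDs and drop the duplicated commit; `--reapply-cherry-picks` skips that comparison entirely.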


Oh, interesting! Tbh I don't fully understand what `--reapply-cherry-picks` really does (the docs are very concise and hand-wavy) or why it avoids the fetches. Why is it not the default?

