
> My biggest regret is not money, it is that Git is such an awful excuse for an SCM. It drives me nuts that the model is a tarball server. Even Linus has admitted to me that it’s a crappy design. It does what he wants, but what he wants is not what the world should want.

Why is this crappy? What would be better?

Edit: @luckydude Thank you for generously responding to the nudge, especially nearly instantly, wow :)




My issues with Git

- No rename support, it guesses

- no weave. Without going into a lot of detail, suppose someone adds N bytes on a branch and then that branch is merged. The N bytes are copied into the merge node (yeah, I know, git looks for that and dedups it but that is a slow bandaid on the problem).

- annotations are wrong, if I added the N bytes on the branch and you merged it, it will (unless this is somehow fixed now) show you as the author of the N bytes in the merge node.

- only one graph for the whole repository. This causes multiple problems: A) the GCA is the repository GCA, it can be miles away from the file GCA if there was a graph per file like BitKeeper has. B) Debugging is upside down, you start at the changeset and drill down. In BitKeeper, because there is a graph per file, let's say I had an assert() pop. You run bk revtool on that file, find the assert and look around to see what has changed before that assert. Hover over a line, it will show you the commit comments to the file and then the changeset. You find the likely line, double click on it, now you are looking at the changeset. We were a tiny company, we never hit the claimed 25 people, and we supported tons of users. This form of debugging was a huge, HUGE, part of why we could support so many people. C) commit comments are per changeset, not per file. We had a graphic check in tool that walked you through the list of files, showed you the diffs for that file and asked you to comment. When you got to the ChangeSet file, now it is asking you for what Git asks for comments but the diffs are all the file names followed by what you just wrote. It made people sort of uplevel their commit comments. We had big customers that insisted the engineers use that tool rather than a command line that checked in everything with the same comment.

- submodules turned Git into CVS. Maybe that's been redone but the last time I looked at it, you couldn't do sideways pulls if you had submodules. BK got this MUCH closer to correct, the repository produced identical results to a mono repository if all the modules were present (and identical less whatever isn't populated in the sparse case). All with exactly the same semantics, same functionality mono or many repos.

- Performance. Git gets really slow in large repositories, we put a ton of work into that in BitKeeper and we were orders of magnitude faster for things like annotate.

In summary, Git isn't really a version control system and Linus has admitted it to me years ago. A version control system needs to faithfully record everything that happened, no more or less. Git doesn't record renames, it passes content across branches by value, not by reference. To me, it feels like a giant step backwards.

Here's another thing. We made a bk fast-export and a bk fast-import that are compatible with Git. You can have a tree in BK, have it updated constantly, and no matter where in the history you run bk fast-export, you will get the same repository. Our fast-export is idempotent. Git can't do that, it doesn't send the rename info because it doesn't record that. That means we have to make it up when doing a bk fast-import which means Git -> BK is not idempotent.

I don't expect to convince anyone of anything at this point, someone nudged, I tried. I don't read hackernews any more so don't expect me to defend what I said, I really don't care at this point. I'm happier away from tech, I just go fish on the ocean and don't think about this stuff.


> No rename support, it guesses

Git doesn't track changes, yes; it tracks states. It has tools to compare those states, but that doesn't mean it needs to track additional data to help those tools.

I'm unconvinced that tracking renames is really helpful as that is only the simplest case of many possible state modifications. What if you split a file A into files B and C? You'd need to be able to track that too. Same for merging one file into another. And many many many more possible modifications. It makes sense to instead focus on the states and then improve the tools to compare them.

Tracking all kinds of changes also requires all development tools to be aware of your version control. You can no longer use standard tools to do mass renames and instead somehow build them on top of your vcs so it can track the operations. That's a huge tradeoff that tracking repository states doesn't have.
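
To make the "it tracks states and guesses" point concrete, here is a minimal sketch of guessing renames by comparing two snapshots. It is not Git's actual algorithm: the function name `guess_renames`, the use of `difflib`, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def guess_renames(old, new, threshold=0.5):
    """Guess renames between two snapshots (dicts of path -> text).

    Illustrative only: pairs each deleted path with the most similar
    added path, if the similarity reaches the (assumed) threshold.
    """
    deleted = {p: c for p, c in old.items() if p not in new}
    added = {p: c for p, c in new.items() if p not in old}

    # Score every deleted/added pair by content similarity.
    candidates = []
    for old_path, old_text in deleted.items():
        for new_path, new_text in added.items():
            matcher = SequenceMatcher(None, old_text, new_text, autojunk=False)
            score = matcher.ratio()
            if score >= threshold:
                candidates.append((score, old_path, new_path))

    # Best scores first; ties are broken by path name for determinism.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))

    renames, used_old, used_new = [], set(), set()
    for score, old_path, new_path in candidates:
        if old_path in used_old or new_path in used_new:
            continue  # each file takes part in at most one rename
        renames.append((old_path, new_path, score))
        used_old.add(old_path)
        used_new.add(new_path)
    return renames


if __name__ == "__main__":
    before = {"a.c": "int main() { return 0; }\n", "keep.h": "#pragma once\n"}
    after = {"b.c": "int main() { return 0; }\n", "keep.h": "#pragma once\n"}
    print(guess_renames(before, after))
```

Note that in this sketch a file split into two halves is reported as a single rename to one of the halves, which is the kind of case both sides of the argument above are pointing at.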

> submodules

I agree, neither submodules nor subtrees are ideal solutions.


> What if you split a file A into files B and C? You'd need to be able to track that too. Same for merging one file into another. And many many many more possible modifications.

I suppose Bitkeeper can meaningfully deal with that since their data model drills down into the file contents.


> You run bk revtool on that file, find the assert and look around to see what has changed before that assert. Hover over a line, it will show you the commit comments to the file and then the changeset. You find the likely line, double click on it, now you are looking at the changeset.

I still have fond memories of the bk revtool. I haven't found anything since that's been as intuitive and useful.


I hadn't heard of the per-file graph concept, and I can see how that would be really useful. But I have to agree that going for a fish sounds marvellous.


I fished today, 3 halibut. Fish tacos for the win! If you cook halibut, be warned that you must take it off at 125 degrees, let it get above that and it turns to shoe leather.


That's an exceptionally detailed answer. One thing I remember is how Microsoft Windows [0] had so much trouble while migrating to git

0. https://arstechnica.com/information-technology/2017/05/90-of...


What's a GCA?


Greatest common ancestor (merge base in git terminology):

https://www.bitkeeper.org/man/gca.html

https://git-scm.com/docs/git-merge-base
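
For readers new to the term, a rough sketch of what a merge base means on a commit graph. The graph representation (a dict mapping each commit to its parents) and the function names are assumptions for illustration, not how either tool implements it.

```python
def ancestors(parents, commit):
    """Return the set of commits reachable from `commit`, including itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def merge_bases(parents, a, b):
    """Return the common ancestors of a and b that are not themselves
    ancestors of another common ancestor (the "best" common ancestors)."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return sorted(
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    )


if __name__ == "__main__":
    # root <- x <- left
    #           \- right
    graph = {"root": [], "x": ["root"], "left": ["x"], "right": ["x"]}
    print(merge_bases(graph, "left", "right"))
```

With one graph for the whole repository, this computation runs over every commit; with a graph per file, as described for BitKeeper above, the same idea would run over only that file's history.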


As someone who has lived in Git for the past decade, I also fail to see why Git is a crappy design. It's easy to distribute, works well, and there's nothing wrong with a tarball server.


Exactly. While the article is good on the history of events, it doesn't go deep enough into the feature evolution (which is tightly connected to, and reflects, the evolution of software development). Which is:

TeamWare - somewhat easy branching (by copying whole workspace from the parent and the bringover/putback of the changes, good merge tool), the history is local, partial commits.

BitKeeper added distributed mode, changesets.

Git added very easy branching, stash, etc.

Any other currently available source control usually is missing at least one of those features. Very illustrative is the case of Mercurial, which emerged at about the same time responding to the same need for a modern source control at the time, yet was missing partial commits, for example, and had much more cumbersome branching (like no local history or something like this - I last looked at it more than a decade ago) - that really allowed it to be used only in very strict/stuffy settings; for everybody else it was a non-starter.


Git is terrible at branching; constantly squashing and rebasing is not a feature but an annoyance. See fossil for how to do proper branching/merging/logging, by its very nature. Not to mention that by having the repository separate from the data, it forces you to organize it in a nice way (mine look like Project/(repo.fossil, branch1/ branch2/ branch3/)). You can achieve this with git now, but I never had to think about it in fossil; it's a natural consequence of the design.


>constantly squashing and rebasing is not a feature but an annoyance

it is a feature which allows, for example, to work simultaneously on several releases, patches, hot fixes, etc. Once a better alternative emerges we'll jump the git ship as we did before when we jumped onto the git ship.

>the repository separate from the data

that was a feature of a bunch of source controls and a reason among others why they lost to git.

>it forces you to

that is another reason why source controls lose to git as git isn't forcing some narrow way of doing things upon you.

I don't deny of course that for some people/teams/projects other source controls work better, as your comment illustrates. I'm just saying why git won and keeps winning in the majority of situations.


> Once a better alternative emerges we'll jump the git ship as we did before when we jumped onto the git ship.

It's not that easy at this point in time. git carries a lot of momentum, especially in combination with GitHub.

Anybody learning about software development learns about git and GitHub.

Software is expected to be in GitHub.

At the time git became successful there were arguably better systems like mercurial and now we got fossil, but git's shortcomings are too little of a pain point compared to universal knowledge about it and integration into every tool (any editor, any CI system, any package manager, ...) and process.


>It's not that easy at this point in time. git carries a lot of momentum, especially in combination with GitHub.

CVS back then was like this too, including public repos, etc.

>At the time git became successful there were arguably better systems like mercurial

I specifically mentioned Mercurial above because they both emerged pretty simultaneously responding to the same challenges, and Mercurial happened to be just inferior due to its design choices. Companies were jumping onto it too, for example our management back then chose it, and it was a predictable huge pain in the neck, and some years down the road it was replaced with git.


> CVS back then was like this too, including public repos, etc.

Not really.

CVS had too many flaws (no atomicity, no proper branching, no good offline work, etc.) Subversion as "natural successor" fixed some things and was eating some parts of CVS.

At the same time sourceforge, the GitHub of that time, started to alienate their users.

And then enterprises used different tools to a way larger degree (VSS, sccs, Bk, perforce, whatever) while that market basically doesn't exist anymore these days and git is ubiquitous.

And many people went way longer without any version control than today. Today kids learn git fundamentals very early, even on Windows, and make it a habit. Whereas in the early 2000s I saw many "professional" developers where the only versioning was the ".bak" or ".old" file or copies of the source directory.


People started paying me to develop software in 1986. First time I ever used version control software was 1996. It was TERRIBLE. Two years later I left to start my own software company, but my experience with it to that point was so bad I went without version control the first few years. Around 2002 I started using CVS (or RCS? long time ago!) and quickly switched to Subversion. After learning git to work on Raku circa 2009, I switched my main $WORK repo to git in maybe 2012. Every repo I've created since then has been in git, but I still haven't moved all my svn repos over to git.


> (VSS, sccs, Bk, perforce, whatever) while that market basically doesn't exist anymore these days and git is ubiquitous.

Perforce still has a solid following in the gamedev space - even with LFS, git's handling of binaries is only mildly less than atrocious.


Yeah but market share shrunk a lot (especially since the market grew massively) and even Perforce is a tool integrating with git these days.


> it is a feature which allows, for example, to work simultaneously on several releases, patches, hot fixes, etc. Once a better alternative emerges we'll jump the git ship as we did before when we jumped onto the git ship.

What are you talking about here? I'm not talking about eliminating branching, but the fact that merging a branch is usually just a fake single commit that hides away the complexity and decisions of the branch. See [0] for how you can leverage branches and the log for a sane commit history.

> that was a feature of a bunch of source controls and a reason among others why they lost to git.

Given the article, git won because of FOSS, Torvalds and speed. If you have proof of a good number of people saying "I hate the division of data and repository!" then it's a believable claim; or maybe you're confusing the data/repo division with cvs? git also didn't have to fight much; the only contender was hg.

[0]: https://fossil-scm.org/home/timeline



