
> Git manages the size of the repository more efficiently; while the Mercurial repository has been approaching 50M in size, the Git repository is only 17M.

17MB vs 50MB - almost a third of the size. That is definitely quite impressive.




I’m not the biggest git fan and I still think it’s the right choice, but how is this even a reason to switch? Unless you go in for (older) Subversion ridiculousness, how are 40MB of disk space even a concern? They even listed it first. At best it would be "Oh, and we saved 40MB", the size of two raw pictures from a digital camera.


The size issues of the SQLAlchemy repository come from the way Mercurial handles copies and renames.

I prefer Mercurial because it is much easier to use, but this file rename issue always makes me feel uncomfortable when reorganizing code.

These days I'm giving Fossil a try, which is still easier to use than Git, and its repository size sits between Git's and Mercurial's.


From the other side of the mirror, I mean from the position of someone used to git, this comment seems weird.

Git is not hard to use. It adds a few articulations to your workflow, and they just allow you to run faster.

One example: interactive staging with git add -p. This articulation makes it much easier to debug: add prints all over the place, try some tweaks, find the one that works, stage that one snippet, check out the files to discard the rest, run the tests, and you're done.


When you consider that distributed repositories are going to be cloned dozens, hundreds, or thousands of times, it starts to add up.

Even if this larger project could handle that bandwidth, it is a significant factor for smaller projects, or for large hosts like GitHub, which means more of a chance that git remains the dominant choice.
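To put a rough number on "it starts to add up", here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the data transferred per clone matches the on-disk sizes quoted above (50 MB vs 17 MB), which will not be exactly true in practice. The function name and the clone counts are made up.

```python
def extra_transfer_mb(larger_mb: float, smaller_mb: float, clones: int) -> float:
    """Extra data moved in total when every clone fetches the larger repository."""
    return (larger_mb - smaller_mb) * clones


# Assumption: transfer size per clone equals the quoted on-disk sizes.
for clones in (12, 100, 1000):
    extra = extra_transfer_mb(50, 17, clones)
    print(f"{clones:>5} clones: about {extra / 1000:.1f} GB extra")
```

Under that assumption the difference is 33 MB per clone, so it only reaches tens of gigabytes once the clone count gets into the thousands.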


The repo is hosted by third parties, so bandwidth is not a factor (it's all text in any event, so the compression factor is huge). There are actual differences worth talking about; file size is not one of them.


git repositories aren't giant masses of text files; they're compressed on disk and the additional gains from compressing them with lzma are minimal (12 MB on a 306 MB git repo I have lying around). I assume Mercurial does something similar as the difference would be a lot larger than 17 vs 50 MB if not.
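The effect is easy to reproduce outside of any repository. Git stores its objects zlib-compressed, and data that has already been through zlib leaves little for a second compressor to find. The sketch below is only an illustration of that point: the sample text is a made-up stand-in for source code, and the helper name is hypothetical.

```python
import lzma
import random
import zlib


def recompression_sizes(raw: bytes) -> tuple[int, int, int]:
    """Return (raw size, zlib size, size of the zlib output after lzma)."""
    deflated = zlib.compress(raw, 9)
    recompressed = lzma.compress(deflated)
    return len(raw), len(deflated), len(recompressed)


# Hypothetical stand-in for source text: random words from a small vocabulary.
rng = random.Random(0)
words = ["def", "return", "import", "self", "session", "query", "commit", "branch"]
sample = " ".join(rng.choice(words) for _ in range(50_000)).encode()

raw_size, zlib_size, lzma_size = recompression_sizes(sample)
print(raw_size, zlib_size, lzma_size)
```

The first step shrinks the text a lot; the second pass over the already-compressed bytes should change the size very little.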


It's the time it takes to clone. Also, with git I need to clone a lot less, since I can create local feature branches that I can delete if they are abandoned.

edit: quick speed test, git clone from my server = 17.4 seconds, hg clone from the same server's hg repo = 25.4 seconds
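For anyone who wants to repeat a test like this, a minimal timing helper might look as follows. It is a sketch, not the method used for the numbers above: the helper name is made up, the clone URLs are placeholders, and a single run says little because network conditions and caching vary.

```python
import subprocess
import sys
import time


def time_command(argv: list[str]) -> float:
    """Run a command to completion and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


# Harmless demo so the sketch runs anywhere; swap in the real clone commands, e.g.
#   time_command(["git", "clone", "<git-url>", "git-copy"])
#   time_command(["hg", "clone", "<hg-url>", "hg-copy"])
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} s")
```

Running each clone several times into fresh directories and comparing the medians would give a fairer picture than one measurement each.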


Time to clone is not directly a function of absolute repository size.

Both Git and Mercurial use hardlinks for local clones by default, so the reason why Mercurial is slower than Git for local clones is primarily that its repository data generally contains more files.
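As a small illustration of why file count matters more than byte count here: a hard link is just a second directory entry for the same data, so a hardlinked clone pays one link operation per file regardless of how large each file is. The Python sketch below shows the mechanism on a throwaway file. The file names are hypothetical and nothing in it touches a real repository.

```python
import os
import tempfile


def hardlink_demo() -> tuple[int, bool]:
    """Create a file plus a hard link to it; return (link count, same inode?)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "object.pack")  # hypothetical file name
        linked = os.path.join(tmp, "clone-object.pack")
        with open(original, "wb") as fh:
            fh.write(b"x" * 4096)
        os.link(original, linked)  # second directory entry, same data on disk
        a, b = os.stat(original), os.stat(linked)
        return a.st_nlink, a.st_ino == b.st_ino


link_count, same_inode = hardlink_demo()
print(link_count, same_inode)
```

Both names point at the same inode, so no file contents are copied; the cost scales with the number of files to link.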

Cloning/pulling/pushing speed across a network is in large part determined by the protocol used; a few years ago, Git's network performance was inferior to that of Mercurial [1]. I believe that has been largely fixed since then.

Finally, Mercurial allows you to have local feature branches that you can delete if they are abandoned just fine.

[1] https://code.google.com/p/support/wiki/DVCSAnalysis -- footnote 1


Your citation is comparing the "dumb http" protocol, which was replaced several years ago with the smart http protocol in git-1.7. Dumb http was only ever provided as a method of last resort. The git:// and ssh protocols have always been fast.

Another data point: when we converted to Git, I did a number of speed comparisons. Our repository was 77MB in Git versus 178 MB in Mercurial. Clone time from bitbucket over either (smart) http or ssh was 18 seconds with Git, versus 2 minutes with Hg. We can do a shallow clone (--depth 1) in 4 seconds (10 MB transferred) with Git, but Hg has no comparable feature.


The speed was also slower for the native protocol, if you read the footnote to the end. Also, as I noted, this was years ago. I was making a point about cloning/pulling/pushing speed depending on more than just the repository size, not about the relative superiority of one or the other tool [1].

[1] I find both Mercurial and Git adequate, but lacking in some aspects that are important to me (both with respect to architectural design and workflow considerations). For practical work, I consider the differences between Mercurial and Git to be relatively minor in comparison and cannot really get exercised over them.


Is it really helpful to provide numbers that you know are many years out of date? If you're really interested in this, try to import a big repository like, say, the Linux kernel using the latest version of Mercurial and see how it compares to the latest version of git.


I was making a point about what factors influence clone performance in general, not trying to contribute to the tedious Git vs. Mercurial debate. If I had found any data about, say, GNU Arch vs. Codeville (or some other abandoned codebase), I would have used that instead.


Mercurial has "hg clone --uncompressed", which really cuts down clone time at the expense of bandwidth.


I would assume it matters to people cloning the repository on slower connections.


If that was an actual issue (I doubt it is), they could have done a one-off repack server side (reordering changesets to optimize compression).

Edit: I think the third point would have been enough; network effect for free software is important. Size (reorder your repo if it matters), branching (use bookmarks, not named branches, unless you know what you are doing), and history rewriting (use evolve) are dubious points.



