
Any chance you could outline the downside of subtree? I've always thought it was better in every way.


A submodule can be used when you want to nest a private repository inside a public one. Only authorized users will be able to fetch from the submodule repository. This can be useful in some cases, such as storing private keys in an otherwise public project.

Also, if you have a very large submodule repository, it won't inflate the size of the repository that contains it.


git-subtree can flatten the other tree into a single commit while keeping the rest of its functionality. That is probably a space saving compared to submodules: anyone who uses your repo probably needs the submodule anyway, which means downloading that other giant repo in its entirety rather than a flattened copy of it.


So if I understand this correctly, it is more of a convenient way to copy another repository into your own (and keep it updated), rather than a reference to another repository.

To respond to the comment about this being useful for private keys: if this is the case, does this also mean unauthorized users will not be able to check out the submodule at all? Or will they just get a copy of the files (which would not make sense with private keys)?


When you git clone a repository with a submodule, before you call git submodule sync/update and all that jazz, you just have an empty directory where the submodule should be.
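For anyone who wants to see this for themselves, here is a minimal sketch using throwaway local repositories. The names lib, app, clone and key.txt are made up for illustration, and the protocol.file.allow override is only there because recent Git versions restrict local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: a fresh clone leaves the submodule directory empty until it is initialised.
# All repository names and file names here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so the commits succeed on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A small repository that plays the role of the (private) submodule.
git init -q lib
(cd lib && echo "secret" > key.txt && git add key.txt && git commit -q -m "add key")

# The containing repository references lib as a submodule.
git init -q app
(cd app &&
  git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib &&
  git commit -q -m "add lib submodule")

# A plain clone records the submodule but does not fetch its contents.
git clone -q app clone
before=$(ls -A clone/lib)

# Fetching the contents is a separate step and needs access to the lib repository.
(cd clone && git -c protocol.file.allow=always submodule --quiet update --init)
after=$(ls -A clone/lib)

echo "before update: [$before]"
echo "after update:  [$after]"
```

Someone without access to the submodule's repository would get stuck at the update step, so they would only ever see the empty directory.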


Subtree operations modify commits (and trees). As a result, cryptographically signing or validating a subtree operation does not extend to an independent module, and vice versa. If you care about these things, it's a big deal.


If you use git-subtree in the mode that doesn't flatten the original tree, it does not, afaik, modify any commits on either side. The original tree, as imported, and the tree being imported into both become parents of the resulting merge commit, with their original SHAs.
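A rough way to check this locally, assuming git-subtree is installed (it ships in Git's contrib area, so some packages omit it). The repositories lib, app-full and app-squash and the vendor/lib prefix are invented for the example.

```shell
#!/bin/sh
# Sketch: compare the parents of the import commit with and without --squash.
# Repository names and the vendor/lib prefix are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream repository that will be imported.
git init -q lib
(cd lib && echo v1 > a.txt && git add a.txt && git commit -q -m "lib: first commit")
lib_sha=$(git -C lib rev-parse HEAD)
lib_branch=$(git -C lib symbolic-ref --short HEAD)

# Two host repositories, one per import mode.
for mode in full squash; do
  git init -q "app-$mode"
  (cd "app-$mode" && echo app > README && git add README && git commit -q -m "app: first commit")
done
app_sha=$(git -C app-full rev-parse HEAD)

# Import with full history, then with the history flattened.
(cd app-full && git subtree add --prefix=vendor/lib "$work/lib" "$lib_branch")
(cd app-squash && git subtree add --squash --prefix=vendor/lib "$work/lib" "$lib_branch")

# Parents of the commit that git subtree created in each repository.
full_parent1=$(git -C app-full rev-parse HEAD^1)
full_parent2=$(git -C app-full rev-parse HEAD^2)
squash_parent2=$(git -C app-squash rev-parse HEAD^2)

echo "upstream commit:        $lib_sha"
echo "full mode, 1st parent:  $full_parent1"
echo "full mode, 2nd parent:  $full_parent2"
echo "squash mode, 2nd parent: $squash_parent2"
```

In the full-history repository the second parent is the upstream commit itself, unchanged. In the squashed one it is a newly created commit, which is where the signing concern comes from.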


Wasn't aware of that. That's kind of awesome.

You seem to be very knowledgeable about subtree, so I'll ask a question you can hopefully answer:

I have multiple repositories each having 100,000-500,000 files, all relating to the same project. The reason I broke them up into multiple repositories is that the git index doesn't handle millions of files all that well (e.g. the index gets rewritten on every modification). Would subtree help me here in any way?

Of course, this is not a usual way to use git, and Facebook opted for switching to Mercurial when faced with a similar problem (though an order of magnitude or two larger, I'm sure).

Submodules work for me now, but they are very clunky. The reason I didn't even try switching was the crypto signing of commits, which is apparently not a problem. Would subtrees help me with the scale as well in any way?


No, unfortunately I don't think subtrees can really help you with this. At least not directly. Using the flattened mode might help, but then you'd be back to your problem with signing them. The subtree merge commit does record the original SHA, so I suppose you could, if you needed to, verify as a separate step that the merge commit is indeed the result of a correctly signed SHA somehow.

But I think for your use case you're probably stuck with submodules.
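On the separate verification step mentioned above, one possible sketch, again assuming git-subtree is installed. It reads the git-subtree-split line that the squashed import writes into its commit message and checks that the imported directory matches the tree of that upstream commit. The repositories lib and app and the vendor/lib prefix are made up, and the signature check is left as a comment because it needs the signer's public key.

```shell
#!/bin/sh
# Sketch: tie a squashed subtree import back to the upstream commit it came from.
# Repository names and the vendor/lib prefix are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
(cd lib && echo v1 > a.txt && git add a.txt && git commit -q -m "lib: first commit")
lib_branch=$(git -C lib symbolic-ref --short HEAD)

git init -q app
(cd app && echo app > README && git add README && git commit -q -m "app: first commit")
(cd app && git subtree add --squash --prefix=vendor/lib "$work/lib" "$lib_branch")

# The upstream SHA recorded in the commit message written by git subtree.
recorded=$(git -C app log --format=%B | sed -n 's/^git-subtree-split: //p' | head -n 1)

# Trees are content-addressed, so equal IDs mean identical imported content.
upstream_tree=$(git -C lib rev-parse "$recorded^{tree}")
imported_tree=$(git -C app rev-parse HEAD:vendor/lib)

if [ "$upstream_tree" = "$imported_tree" ]; then
  echo "vendor/lib matches upstream commit $recorded"
else
  echo "vendor/lib differs from upstream commit $recorded"
fi

# With the signer's key available, the upstream commit could then be checked with:
#   git -C lib verify-commit "$recorded"
```

This only covers the state right after the import. Local changes under vendor/lib made later would make the trees differ, so the check would have to run against the import commit rather than the current HEAD.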



