I've tried several times over the past couple of months to read articles or introductions to git. The one thing they seem to have in common is a contest over who can come up with the most outrageous way of saying how bad other version control systems are. So I read through this whole website, and again it fails to say: what does git do that others can't? Specifically, what is better about git than Subversion? There's the 'distributed' aspect, which is nice in some specific scenarios. There also seem to be some niceties like adding files to a commit one by one and doing a 'final' commit only at the end; that's a pain to do with the command-line Subversion client (but it's really easy with TortoiseSVN).
So, is there a concise explanation somewhere of what makes git better than subversion?
Branching works. I know it works in theory in SVN, but in practice one often runs into problems with any non-trivial branch. With git (or really any DVCS), branching typically Just Works.
As a recent convert to git (from hg), who didn't "get it" before, I think you have to qualify what you mean by branching works.
Effective branching is the killer feature of git or any source control system, to me. And in that regard:
svn < mercurial < git
svn: Three-way merges are close to impossible. To do a three-way merge of a file you need the two versions being merged plus their common ancestor. Finding that ancestor is an incredibly difficult problem in svn that, to my knowledge, can't be solved in a durable way. In mercurial and git you can simply walk the commit graph back to the common ancestor.
mercurial: Somewhat fails at the concept of a short-lived local branch. You can't create a local branch in place (that is, without cloning the entire repository to a different folder) to try out some feature and then delete the branch if it doesn't work out. At least not without using bookmarks or the mq or localbranch extensions. With these you get a bookmark, a patch queue, or an in-repository clone, respectively. None of these are branches in the fundamental sense.
git: You can get a short-lived local branch, a long-lived one, a release branch, a bug fix branch, a feature branch, branch, branch, branch. In git branching just works, and that, to me, compared to mercurial, means specifically working local branches that are not fundamentally different from the rest of the repository.
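To make "walk the graph to the common ancestor" concrete, here is a minimal Python sketch. The commit names and the PARENTS table are made up for illustration, and the search is a plain breadth-first walk, not git's actual merge-base algorithm: it returns the common ancestor nearest to the second commit by hop count, which is not guaranteed to be the best merge base in histories with criss-cross merges.

```python
from collections import deque

# Hypothetical toy history: each commit id maps to its list of parent ids.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # main line:      A <- B <- C
    "D": ["B"],  # feature branch: B <- D <- E
    "E": ["D"],
}


def ancestors(commit, parents):
    """Return every commit reachable from `commit`, including itself."""
    seen = set()
    queue = deque([commit])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents[current])
    return seen


def merge_base(ours, theirs, parents):
    """Walk back from `theirs` until we reach a commit that `ours` also descends from."""
    ours_ancestors = ancestors(ours, parents)
    seen = set()
    queue = deque([theirs])
    while queue:
        current = queue.popleft()
        if current in ours_ancestors:
            return current
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents[current])
    return None  # unrelated histories


print(merge_base("C", "E", PARENTS))  # prints B
```

The point is only that once every commit records its parents, the third input to a three-way merge falls out of a graph search; nothing has to be tracked by hand.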
The killer combination for me is branching + distributed. Your branches don't have to be mirrored on a remote server for all to see.
Branching & merging are also insanely easy to do, so I find that I develop almost all new features in their own branch, and I can easily just throw them away if I want. I never did this with subversion.
What are those problems? Why are there fewer of them with git? I often make branches with subversion; what goes wrong is keeping track of which features/changesets need to be merged across which branches, plus the 'dependencies' of those changesets (changes in previous commits to the branch that add e.g. a class definition that the changesets depend on). Does git improve on that? How?
What he really means is that branches are now a part of your daily workflow. You can sync up to head (or wherever you pulled your working copy of the repository from), create a branch, and start working on some feature on that branch, mucking up a hundred files along the way. Now say some urgent issue comes up that you need to fix immediately -- you can easily go back to the point before you branched, and create a new branch from there on which you can fix the issue. Meanwhile, your previous branch still exists, so when the urgent issue is fixed you can resume where you left off. Another great thing about branches: say you're working on some feature in a branch and you come to a point where you can implement something multiple ways, but don't know which one will turn out elegant. You can create a new branch from that point in your current branch, and if it doesn't pan out, go back to where you were.
The kicker with Git is that all these branch operations take on the order of milliseconds because Git stores the entire project history -- all past revisions of files, etc -- locally. Sure, you pay in a bit of disk space, but consequently almost all common operations can be done without hitting the network and are crazy fast. Even when you execute a commit, you're not committing to a server, but to your local copy of the repository. Later, you can push your changes somewhere else and merge accordingly.
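One way to picture why this is cheap: treat a branch as nothing more than a name pointing at a commit. The Python sketch below is a made-up in-memory model (ToyRepo and all of its methods are hypothetical, not git's real data structures) that replays the "urgent fix in the middle of a feature" scenario described above.

```python
import hashlib


class ToyRepo:
    """Hypothetical in-memory model of a repository; not git's real internals."""

    def __init__(self):
        self.commits = {}                  # commit id -> (parent id, message)
        self.branches = {"master": None}   # branch name -> commit id
        self.head = "master"               # name of the checked-out branch

    def commit(self, message):
        parent = self.branches[self.head]
        commit_id = hashlib.sha1(f"{parent}:{message}".encode()).hexdigest()[:8]
        self.commits[commit_id] = (parent, message)
        self.branches[self.head] = commit_id  # only the current branch moves
        return commit_id

    def branch(self, name, start=None):
        # Creating a branch copies no files: it records one more pointer.
        self.branches[name] = start if start is not None else self.branches[self.head]

    def checkout(self, name):
        self.head = name


repo = ToyRepo()
base = repo.commit("initial import")

repo.branch("feature")
repo.checkout("feature")
wip = repo.commit("half-finished feature")

# Urgent issue: branch from the point before the feature work started.
repo.branch("hotfix", start=base)
repo.checkout("hotfix")
fix = repo.commit("urgent fix")

# Resume the feature exactly where it was left.
repo.checkout("feature")
```

Nothing in this sequence touches a server, and the feature branch is untouched by the hotfix; that is the whole trick.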
Furthermore, git stores the entire project history locally, compressed in such a way that a live git project (checkout + entire history) often takes LESS space than an equivalent live subversion project (checkout * 2, because subversion keeps a pristine copy of every file so you can diff without going to the server).
Obviously, if you add a 1GB random file and then delete it, the entire git history will have 1GB more data than an SVN checkout. But usually, it's smaller.
I've switched to using git svn exclusively for svn projects I'm involved with since I noticed that -- I save space and have complete project history locally. What more could you ask for?
(P.S. git svn is a better svn client than svn. Really.)
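A rough feel for where the space savings come from, as a Python sketch. It assumes a toy content-addressed store (ObjectStore is a hypothetical name) that keeps each distinct file content once and zlib-compresses it. Real git additionally packs objects as deltas against each other, which this sketch does not attempt.

```python
import hashlib
import zlib


class ObjectStore:
    """Hypothetical content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.objects = {}  # content hash -> compressed bytes

    def put(self, content: bytes) -> str:
        oid = hashlib.sha1(content).hexdigest()
        # setdefault keeps the existing entry when this content was stored before.
        self.objects.setdefault(oid, zlib.compress(content))
        return oid

    def get(self, oid: str) -> bytes:
        return zlib.decompress(self.objects[oid])

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.objects.values())


store = ObjectStore()
license_text = b"All rights reserved.\n" * 500  # a large file that never changes
history = []
for rev in range(50):
    snapshot = {
        "LICENSE": store.put(license_text),
        "main.py": store.put(f"print('revision {rev}')\n".encode()),
    }
    history.append(snapshot)

print(len(store.objects))  # 51: one LICENSE object plus 50 versions of main.py
```

Fifty full snapshots cost one copy of the big file plus fifty tiny ones, which is why "checkout + entire history" can undercut "checkout * 2".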
Here is my favourite drawing illustrating one of the problems with Subversion:
|
| file p.txt contains "X"
|
+<---(branch)--------+
|                    |
* file p.txt renamed |
| to f/p.txt, which  |
| contains "X"       |
|                    * the content of file p.txt is updated to
|                    | contain "Y"
|                    |
|                    |
+----(merge)-------->+ file p.txt is deleted and file f/p.txt is created
|                      with the content "X"
|
|                      BUT expected/wanted content is: "Y" (or at least a conflict)
V
Branches are usable only when a merge is not unnecessarily painful. SVN's inability to follow renames through a merge is a big pain, especially in languages like Java where the filename corresponds to the code itself. A merge in SVN can leave the codebase broken.
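The outcome the drawing asks for can be sketched in a few lines of Python. This is a toy three-way merge over {path: content} dictionaries, under loud assumptions: the function names are hypothetical, renames are detected only by an exact content match (a stand-in for the similarity heuristics a real tool would use), and files deleted or newly added on the "theirs" side are not handled at all.

```python
def detect_renames(base, side):
    """Map old path -> new path when a file vanished and identical content appeared elsewhere."""
    removed = {path: content for path, content in base.items() if path not in side}
    added = {path: content for path, content in side.items() if path not in base}
    renames = {}
    for old_path, content in removed.items():
        for new_path, new_content in added.items():
            if new_content == content and new_path not in renames.values():
                renames[old_path] = new_path
                break
    return renames


def merge_trees(base, ours, theirs):
    """Three-way merge of {path: content} trees that follows renames made on `ours`."""
    renames = detect_renames(base, ours)
    merged = dict(ours)
    conflicts = []
    for path, base_content in base.items():
        target = renames.get(path, path)  # where this file lives on our side now
        their_content = theirs.get(path)
        if their_content is None or their_content == base_content:
            continue  # theirs did not edit this file (deletions are out of scope here)
        our_content = ours.get(target)
        if our_content == base_content:
            merged[target] = their_content  # only theirs changed it: take their edit
        elif our_content != their_content:
            conflicts.append(target)        # both sides changed it differently
    return merged, conflicts


base = {"p.txt": "X"}
ours = {"f/p.txt": "X"}    # renamed, content untouched
theirs = {"p.txt": "Y"}    # edited in place

print(merge_trees(base, ours, theirs))  # ({'f/p.txt': 'Y'}, [])
```

The edit follows the file to its new path, which is the "Y" the drawing wants instead of a silently resurrected "X".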
Others mention the super-fast branching. One person mentioned partial add/commits. Someone mentioned the size difference (it is HUGE... in my experience a checkout of one SVN revision in a large repository can be six times as big as your entire git repo plus checked-out files!). There is history rewriting (split a commit, squash some commits together, reorder them, separate them into different branches, deploy what you want! It's like editing a document!). The user interface is helpful and informative if you understand git lingo and version control theory; for example, if you misspell a command, git suggests commands that are similar to what you meant. On these things alone it blows SVN out of the water.
But to pull a page from Linus's playbook, since no one mentioned it: git fsck. No other SCM I am aware of can check its own object store and tell you exactly what a small corruption damaged. I have on rare occasion had to deal with these corruptions on network shares, and a "git fsck" got me back on my feet in a jiffy. This is really useful for recovering work on your unshared local branches.
While there are a lot of lists that proclaim what's better about git -- at some level you're right: you can do the same things in subversion as you can do in git. (Even though svnmerge doesn't really do nearly the same things as git merge.)
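Why can a check like this work at all? Every object is named by a hash of its own content, so damage is detectable by rehashing. The Python sketch below uses the SHA-1 blob naming scheme (a "blob <size>" header, a NUL byte, then the content); the check_store helper and the in-memory store are hypothetical, and this covers only the detection half. The real git fsck also verifies how objects connect to each other, which is not modelled here.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Name a blob the way git's SHA-1 object format does: hash of header + content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def check_store(store):
    """Return the ids whose stored bytes no longer hash to their own name."""
    return sorted(oid for oid, content in store.items() if blob_id(content) != oid)


# Hypothetical in-memory object store: id -> raw content.
store = {}
for text in (b"hello\n", b"world\n"):
    store[blob_id(text)] = text

print(check_store(store))  # [] while everything is intact

damaged = blob_id(b"hello\n")
store[damaged] = b"hellp\n"  # simulate a single corrupted byte on disk
print(check_store(store) == [damaged])  # True: the damaged object is singled out
```

Because the check pinpoints which objects are bad, everything else in the repository can be trusted and the damaged pieces can be fetched again from any other clone.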
That being said, I only saw the main gains of git after using it for a while. It is no longer a chore to branch and merge; it's instantaneous. It takes no time to commit. There is no penalty to committing -- my commits don't make it to everyone else until I'm ready to send them. This means that in a git environment I find myself committing, branching and merging more often.
Over time, you find that the increased number of commits and branches helps you to collaborate with others in ways that you just weren't used to thinking about in subversion.
If you are always working on things very linearly and are never context switching (and your peers aren't context switching), then you will notice no change with git. However, I have never worked at a place where this is the case.
This isn't really an explanation of why git is better than subversion overall, but it was a pretty big eye-opener for me, and I'm definitely still learning the basics of git: http://tomayko.com/writings/the-thing-about-git
To sum up, git solves what Ryan Tomayko calls "The Tangled Working Copy Problem" with amazing ease. If you have a working copy filled with changes, git add --patch lets you pick and choose which files, and even which changes within a file, to commit. It's obviously possible to do this manually with svn (or even cvs) but git just makes it so easy.
Merging, cherry-picking, scalability/speed, rebasing, synchronizing multiple remote repositories, no need for commit "rights", and GitHub are my reasons for preferring git over svn...
Fast because every clone is a local copy of the complete repository with all commits and branches.
Small because of the way it compresses and stores its history.
Subversion tends to be really slow when doing branching, merging and getting commit logs because it always needs to communicate with the remote server to transfer files and data.
Subversion checkouts seem to be really huge because each one holds a pristine copy of the revision you checked out alongside the files you are currently working on.
Imagine if you made a Git branch and then, when you attempted to merge back to master, you got errors saying that it could not complete. Even after trying all your Git-fu. In other words, you would now have to check out the main trunk of the project and manually merge your changes (after backing up your failed branch with a file system copy). Yes, svn merge and svn merge --reintegrate usually work, but there are times when the entire process completely fails and makes you want to do shots at 3pm on a Monday. (Good luck if someone deleted files you have in your private branch!) You Git developers have it good. Branching is like breathing to Git, while it is like a dental checkup in SVN. I prefer Git and avoid svn branching since it is like touching a hot stove. So, to stay sane, I basically work directly against the SVN trunk, which means I cannot commit work in progress because I am not on a private branch.
Well, here's why distributed is better than centralized:
You're working on something in SVN. You've made progress that you'd hate to lose, but it's not great yet. Which do you do?
1) Don't commit until it's great. This prevents bugs from hurting your colleagues, but puts you at risk of losing work or being unable to undo mistakes - in other words, it's just like not having version control.
2) Commit a lot. Opposite problem - your teammates get your bugs.
Distributed version control lets you have both benefits: commit locally as often as you want, and push it up when it's good.
(I think I stole this explanation from Joel Spolsky. Whoever said it, it was what made me realize I should learn a distributed VCS.)
P.S. If your answer to the dilemma above was "use a branch until you're ready," well, OK, but branching in SVN is painful and slow in practice: creating, switching to, and merging a branch all mean going back to the server.
The whole distributed concept really works. No more "WiFi is down, so I cannot commit". Rebasing is also a nice tool which saves a lot of headaches, especially for people trying to figure out what you did a lot later...
How is this an advantage? Whether or not I can commit locally, I still have to get my changes into a central repository for other developers to see...
It's an advantage because you can make commits that are the same size and scope as the ones you'd normally make when connected. You don't have to save up for one big giant commit that has a bunch of unrelated changes.
Offline commits also make it easy to switch contexts quickly: to go from adding a new feature to fixing a bug you just noticed.
It could be due to Linus creating git for use on the Linux kernel. It's a huge project, and if the source control system can work for him, surely it can work for others?
A well-designed and simple git tutorial; this one seems easy to run through. In addition, the steps are broken down into micro chunks, which are always easier for people who have no clue. Wonderful, and thank you.
So far I have only heard that git does not handle large binary files well, and that it is supposedly better to split a large source tree (larger than the Linux kernel!?) into many smaller git repositories.
And you have to learn something new and unlearn bad habits.
It doesn't handle large binary files well, but would you track patches on them anyway? It doesn't fit git's model. Git deals with the current working state of the whole repository, not individual files.
A major problem with git is that if you try to use it without realizing its fundamental model is different, it will seem awkward and complicated. Don't think about it as "like svn, but distributed"; start from zero.
For some projects you also want to track versions of binary files because they go together with the code. And I read it was a bad idea to use git for those binary files, exactly because git was not designed for that.