Curious as to why - What usecases do you have for staging? Why does sourcetree m...

acemarke · on Nov 23, 2018

Incremental staging of the next commit is absolutely a valuable feature.

I'll often touch multiple files over the course of a few hours, sometimes for slightly unrelated changes, and then want to pull apart the changes into several smaller logical commits. Sometimes a couple hunks in a file might need to go into Commit A, while others go into Commit B.

You can do piecemeal adding of hunks via the CLI, but the interface is horrible. Being able to simply click "Add Hunk" in a GUI is incredibly valuable (or even in some cases shift-clicking a couple lines and "Add Lines").

ufo · on Nov 23, 2018

You can get that kind of interface without a staging area. TortoiseHg would do it using Mercurial's shelving feature (similar to git stash) and IMO that was simpler to understand than staging. In particular, I liked that at all times there was a correspondence between the current working directory and what was being committed. It made it easier to run tests.

chousuke · on Nov 23, 2018

I would hate to be restricted to a GUI tool to achieve what git add can do from the command line. Using stashes would be just reimplementing the index using less convenient UI.

I really don't understand what's so difficult about the index... It's just the stuff you will be inserting into the repository when you next commit. Having it separated enables a very convenient workflow that would've required manually using patch and diff when using tools that don't support you.

Git is more than just revision storage. I like to think of code as clay, and the index as a tool you use to mould that into the final construct that gets baked.
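
To make "the stuff you will be inserting into the repository when you next commit" concrete, here is a toy Python sketch of the three states involved. It is not how git stores anything: the `ToyRepo` class and its method names are invented for illustration, and it stages whole files rather than hunks.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Toy model: each state maps a path to file content (hypothetical, not git's format)."""
    working_tree: dict = field(default_factory=dict)  # files as they are on disk
    index: dict = field(default_factory=dict)         # what the next commit will contain
    commits: list = field(default_factory=list)       # snapshots, newest last

    def add(self, path):
        # Stage one path: copy its current content from the working tree into the index.
        self.index[path] = self.working_tree[path]

    def commit(self):
        # A commit is a snapshot of the index, not of the working tree.
        self.commits.append(dict(self.index))


repo = ToyRepo()
repo.working_tree = {"a.py": "fix bug", "b.py": "half-finished experiment"}
repo.add("a.py")  # only a.py is staged
repo.commit()

print(repo.commits[-1])           # {'a.py': 'fix bug'}
print(sorted(repo.working_tree))  # ['a.py', 'b.py']
```

The unstaged `b.py` stays on disk untouched, which is the separation the index provides.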

jobigoud · on Nov 23, 2018

> I would hate to be restricted to a GUI tool to achieve what git add can do from the command line.

I don't quite understand the use of the command line for git or hg.

In my typical workflow before I commit I want to quickly review all the changes I just made. With the GUI you have a list of files and when a file is selected a diff of that file, without opening a new window. That means you can browse all the diffs in a few seconds just by moving the cursor along the changed files. If you see an unrelated change that doesn't have an impact you can just uncheck these lines to remove these changes from the commit. Or uncheck the file itself if all its changes are unrelated. I feel I'm much more confident of what ends up in the commit than using the cli.

chousuke · on Nov 24, 2018

I often find GUIs restrictive because they don't lend themselves well to ad-hoc scripting. I also don't use GUI editors, so that may be a part of it.

I can see how a TUI/GUI for being able to quickly stage things from the UI is useful, but it doesn't make the concept of the index useless on the command line either.

Using git from the command line is second nature to me at this point, and the index is a large part of my workflow. When I use git for revision control (as opposed to, e.g., Subversion, or anything that doesn't have lightweight branching, something akin to an index, and rebases), it feels like the tool is helping me organize my commits instead of just being a place to shove things after I'm done coding.

Shorel · on Nov 26, 2018

What you describe is achieved by using git diff, and then git commit -p.

I use that every day. Whatever you imagined we command line people use, it is just not accurate, IMO.

maxxxxx · on Nov 23, 2018

Sounds like how I work.

quietbritishjim · on Nov 23, 2018

I have a different workflow in TortoiseHg than ufo. In your terminology, I only consider revisions that have been pushed as being baked, while unpushed revisions are the clay. We already have tools to manipulate revisions, while the index, stash/shelve and patches are all just reimplementations of revisions with less convenient UIs. I just go ahead and commit whenever I like (without having to stage!), and sort it out afterwards with a combination of update (like git checkout), revert on specific files (like git reset --soft), and rebase (like git rebase!).

You might say this is trading one complexity (staging area) for another. But you need to be familiar with these commands anyway, so you might as well use the same things for other tasks. Plus, they're visualised on the same revision graph as everything else (e.g. "oh look, one month ago I saved that private commit").

Mercurial has a few properties that make this easier than in git. For example, there is no concept of detached head / garbage collection; when you save a commit, it is simply saved forever unless you choose to forcibly remove it. (I have never found myself wishing for Git's refs and heads; they are just straight up unnecessary.) But I could imagine a git wrapper that had these properties too (e.g. when I checkout an old revision, it automatically creates a new branch with a special name that refs the commit I'm leaving behind).

(PS: Having said all of this, 90% of the time I just go ahead and commit and push everything, and 90% of the rest of the time it suffices to just untick the checkboxes next to files that I don't want to commit before clicking the commit button in TortoiseHg. This is massively easier than any of these strategies, and it's opt-in. I'm sure there is a command line way to do this but I find a GUI is great for this task because it's quite visual.)

chousuke · on Nov 24, 2018

> Mercurial has a few properties that make this easier than in git. For example, there is no concept of detached head / garbage collection; when you save a commit, it is simply saved forever unless you choose to forcibly remove it. (I have never found myself wishing for Git's refs and heads; they are just straight up unnecessary.) But I could imagine a git wrapper that had these properties too (e.g. when I checkout an old revision, it automatically creates a new branch with a special name that refs the commit I'm leaving behind).

I don't get it. What does mercurial have instead of refs?

HEAD in git is just one kind of ref, differentiated from a branch only by when the tooling updates it (i.e. when you create a commit in the checked out branch, or when you check out another commit).

"branches" are mutable references. "remote" branches are just refs pointing at the commit that was the remote branch the last time you fetched it to your local repository. They're all the same thing in the end, though.

The way Mercurial branches seem to be a special thing instead of a property of the structure of a repository is partially why I prefer git.

In git, the fact that a "named branch" is actually just a ref to a particular commit that gets updated occasionally makes perfect sense to me. The way mercurial does it has never clicked; why does a branch have to be something special?
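
The "a branch is just a ref that gets updated occasionally" picture can be sketched in a few lines of Python. This is a simplified model, not git's implementation: the commit ids, the `commit` and `ancestors` helpers, and the plain dictionaries are all made up for illustration, and HEAD is modelled only in its attached form (naming a branch).

```python
# Toy commit graph: commit id -> list of parent ids (hypothetical ids).
parents = {"c1": [], "c2": ["c1"]}

# Refs are just names pointing at commit ids.
refs = {
    "refs/heads/master": "c2",
    "refs/remotes/origin/master": "c1",  # where the remote branch was at the last fetch
}
head = "refs/heads/master"  # HEAD names the checked-out branch


def commit(new_id):
    """Create a commit on the checked-out branch and move that branch's ref to it."""
    parents[new_id] = [refs[head]]
    refs[head] = new_id


def ancestors(commit_id):
    """Every commit reachable from commit_id, i.e. the lineage that commit defines."""
    seen = []
    stack = [commit_id]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents[current])
    return seen


commit("c3")
print(refs["refs/heads/master"])             # c3
print(ancestors(refs["refs/heads/master"]))  # ['c3', 'c2', 'c1']
print(refs["refs/remotes/origin/master"])    # c1
```

Committing moves only the local branch ref; the remote-tracking ref stays where it was until the next fetch.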

I mean, in git, a branch that has no ref pointing to it is still a branch, but if you have no reference to a commit, why would its existence matter? Sure, you can end up "losing" commits when working with rebase, but no-one actually leaves their repo that way if the commit was important; you just look it up from the reflog and give it a name again, and it won't get garbage collected.

And after a branch is merged, why would you keep around a named reference to a branch head that no longer matters? IIRC mercurial has a feature to "close" branches, but I don't understand why that even needs to be a thing? In git, you just delete unnecessary branches (that is, refs), and you're done.

Maybe I misunderstand how mercurial works, but I've not found a source explaining the rationale behind why it works this way.

quietbritishjim · on Nov 25, 2018

For Mercurial, you need to totally abandon the idea that a branch is stored by reffing a commit, and the branch is implicitly defined as the ancestor commits of that head. Instead, every single commit has a branch name baked into it as a simple string; a branch is defined as all commits whose branch name matches that name. To open a new branch, simply make a commit with a branch name that hasn't been used before. By default, the branch name of a commit is the same as its immediate ancestor.

When you switch branch in Mercurial, in principle it simply looks through all commits to determine the latest one with that branch name. In practice I'm sure there is a cache of this, which is the closest there is to git refs, but that is totally transparent. All commits must have a branch name, and the first commit’s branch name is usually "default" (equivalent to git's "master"). That close branch feature you mentioned is a special type of commit that stops that branch from being included in the list of branches, but the branch still exists because all previous commits to it still do.

This has a few consequences compared to Git's branches, which may be good or bad depending on the situation and your personal opinion. One is that a commit cannot be a member of more than one branch. Another is that branches are far less mutable than they are in Git; to change the branch that a commit is in you must actually destroy and re-create that commit (e.g. with rebase).

Another consequence is that it is fine for a branch to have multiple heads; just go back to a slightly older revision and commit again. Or, more commonly, pull after making your own commits and find someone else has committed to the same branch. (If you prefer: find that someone else has made some commits with the same "branch name" metadata.) Usually this situation is only transient because you would merge or rebase. Indeed, you cannot create multiple heads for a branch on a remote repo unless you force push, so you'd normally merge/rebase before pushing.
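
As a rough illustration of the model described above (branch name stored in every commit, a branch being all commits with that name, and multiple heads being allowed), here is a minimal Python sketch. It is an assumption-laden toy, not Mercurial's code: the `Commit` class, the revision numbers, and the helper functions are invented, merges are left out, and "head" is approximated as "no child on the same branch".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    rev: int               # local revision number, increasing over time
    branch: str            # branch name stored in the commit itself
    parent: Optional[int]  # parent rev (merge commits omitted for brevity)


history = [
    Commit(0, "default", None),
    Commit(1, "default", 0),
    Commit(2, "feature", 1),  # first commit naming "feature" opens that branch
    Commit(3, "default", 1),
    Commit(4, "default", 1),  # committed on top of rev 1 again: a second head
]


def branch_commits(name):
    """A branch is every commit whose stored branch name matches."""
    return [c.rev for c in history if c.branch == name]


def branch_heads(name):
    """Commits on the branch that have no child on the same branch."""
    has_child = {c.parent for c in history if c.branch == name}
    return [c.rev for c in history if c.branch == name and c.rev not in has_child]


def tip(name):
    """Switching to a branch picks the latest commit carrying that name."""
    return max(branch_commits(name))


print(branch_commits("default"))  # [0, 1, 3, 4]
print(branch_heads("default"))    # [3, 4]
print(tip("feature"))             # 2
```

Nothing here resembles a ref: the branch membership and the heads are both derived from the commits themselves.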

Another bit of commit metadata is state: public, draft or secret. Commits are draft when you create them, essentially meaning "not pushed". Public commits are ones that have been pushed to or pulled from a remote repo. In git, to see what things look like on the remote, you’d use the remote ref; in Mercurial, you’d look at the public commits. Admittedly you lose the ability to understand which remote repo a commit was seen on, whereas git supports multiple remotes. (During a push, Mercurial will still check whether “public” commits are on a remote repo and push them if necessary.) You can mark a commit as “secret” and then it will not be pushed unless you change its state back. This is what I use when I commit something for my own reference later on, which I was talking about in my previous comment. Often when I rebase, I leave the original commits behind and mark them as secret in case something goes wrong.
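
The public/draft/secret states (Mercurial calls them phases) can be sketched as a filter applied at push time. This is a toy under stated assumptions, not Mercurial's behaviour in full: the `push` function, the revision names, and the `remote` set are hypothetical, and ancestry between commits is ignored.

```python
PUBLIC, DRAFT, SECRET = "public", "draft", "secret"


def push(local_phases, remote):
    """Send every non-secret commit the remote lacks; sent commits become public."""
    sent = [
        rev for rev, phase in local_phases.items()
        if phase != SECRET and rev not in remote
    ]
    for rev in sent:
        remote.add(rev)
        local_phases[rev] = PUBLIC
    return sent


local = {"r1": PUBLIC, "r2": DRAFT, "r3": SECRET}
remote = {"r1"}

first_push = push(local, remote)
print(first_push)   # ['r2']
print(local["r3"])  # secret

local["r3"] = DRAFT  # change the state back to allow pushing
second_push = push(local, remote)
print(second_push)  # ['r3']
```

The secret commit stays local until its state is changed, which is what makes it usable for private reference commits.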

I can see you think branches are "something special" in Mercurial. Maybe in Git you could use refs to implement something totally different to branches, whereas in Mercurial you're stuck with that exact implementation. But refs seem like an unnecessary implementation detail to me. I never want to implement something similar but not quite the same as branches; I just want branches to work! And by not having any refs whatsoever the system seems quite a bit simpler.

chousuke · on Nov 28, 2018

Thanks for the explanation. I don't think it's a good idea, but it does clarify some things.

In git, a branch is a branch by virtue of being an actual physical branch in the data structure.

To elaborate, I think a branch having multiple heads is not really a good thing, since that leads to the actual structure being one or more branches (each point of divergence results in a branch in the commit lineage), but only one name to refer to the collection of commits making up those branches. That abstraction doesn't agree with me.

Since a branch (a lineage of commits) is unambiguously defined by its head commit, having a named ref as the branch abstraction makes more sense to me. I don't know how it could be any simpler.

Certhas · on Nov 23, 2018

It makes sense as a tool that should be available. It does not make sense as a default hoop everyone has to jump through. The way that stash, index, working directory and branches interact is completely non-obvious and induces a lot of accidental complexity.

As far as I can see the use case you describe is actually covered in gl commit.

cyphar · on Nov 23, 2018

"git commit -a" will skip staging entirely if that's what you'd like to do. Should "-a" have been the default? Maybe, but that doesn't mean that you can't use it.

Now, if you want to exclude certain files from a commit I will freely admit this is a pain. The easiest way, especially if you have loads of untracked files you don't want to commit, is to do:

    % git commit -a && git reset HEAD~ <file> && git commit --amend

Which is pretty awful, I will admit. I had an alias to do this a while ago (without the need to amend the commit) but it wasn't great fun.

However staging is still a useful concept, and "gl commit" as far as I can tell doesn't really allow me to do something I do quite often ("git add -p" to add partial hunks of a change to staging for a commit). I recognise that I'm probably in the vast minority (outside of the kernel community) when it comes to my Git usage, but staging is definitely quite important for some Git usecases.

doubleunplussed · on Nov 23, 2018

What is tortoisehg doing when you check/uncheck files for commit? I think you can even check/uncheck individual hunks in the diff interface next to the changed-files list. I don't think it's shelving unmarked changes, the changes are still there on disk.

It looks like mercurial has a staging area, just that the default is to stage everything unless the user says otherwise, rather than the other way around as in git.

Funny that people are mostly unaware of mercurial supporting this - I would argue not using it by default is pretty user friendly, so long as you know it's there when you need it. I use it pretty often, but it's nice not having the boxes unchecked by default. I usually do want to commit everything.

Relatedly, TortoiseHg is the reason I don't use git for my personal projects. No git gui that works on Linux comes close.

vthriller · on Nov 23, 2018

I split file changes into multiple commits all the time. That alone makes many diffs much more readable and changes easier to follow, even if it results in occasional git-log clutter.

cyphar · on Nov 23, 2018

What is the redundant step? Do you not use "git commit -a" -- or does it not work in certain cases?

Staging is quite useful if you want to create several commits (perhaps fixup commits, perhaps whole commits) based on the current repository state. This is why you can do "git add -p" to add parts of files to staging.

You mentioned later in the thread that you prefer stashing -- but Git doesn't let you stash certain files (you have to stash the entire state), and in addition stashing has a much more awful UX than commits (it's represented as a stack which means you have to pop elements off it -- and you've then lost that stack entry and you can't pop the stack with uncommitted changes). If there is one aspect of Git's UX that needs to be massively redesigned it would be "git stash".

jayd16 · on Nov 23, 2018

You might not use it to do partial commits but many others do.

joppy · on Nov 23, 2018

Don't partial commits run the big risk of committing code that may not work, since the partial commit on your own computer will be tested against your working tree, whereas after it's committed it runs with a different view of the files?

Piskvorrr · on Nov 23, 2018

It could. OTOH, most workflows I've seen use the approach "do whatever you wish in your repo and in your branches, but merging requires that the tests pass on a CI server, which only has access to the repository and has exactly zero uncommitted files."

In other words, go ahead and break your builds six ways from Sunday, but they won't get merged until they pass based solely on the state of the repo.

explainplease · on Nov 23, 2018

Not if you know what you're doing. For example, say you added a bunch of TODOs and FIXMEs you noticed in existing code while working on something. Now, you're ready to commit. The TODOs and FIXMEs don't necessarily correlate to the code changes you made. Should they all be committed with the code?

Using the index to stage and commit hunks separately lets you easily add and commit the unrelated TODO and FIXME comments separately from the code. A tool like Magit makes it trivially easy to do.

And since they're comments, they have no effect on tests or the code.

jayd16 · on Nov 23, 2018

In reality I use it to do partial stashes more often than partial commits.

That said, you could extend your argument such that you should always commit every touched file but that would be awful.

BalinKing · on Nov 23, 2018

Just to play devil's advocate, to achieve this always-working guarantee you could: stage, commit, stash your working directory, test your code, and then unstash.

kurtisc · on Nov 23, 2018

Most of the time I'm doing partial commits it's because of squashing them into older commits via rebasing. Rebasing allows you to run an arbitrary command (e.g. your test suite) after each step and will pause the rebase if it returns non-zero.

maxxxxx · on Nov 23, 2018

Maybe it depends on what you're working on. I often have to try multiple things to see what's working. During that process I slowly stage things that are already working and then move on to the next part. Before the final commit I often discard things that I changed but didn't turn out to be necessary.

Maybe there are other ways to do this but for me it's the perfect workflow.

randallsquared · on Nov 23, 2018

It is common for me to have miscellaneous changes I don’t want to commit, but I think this is problematic on my part, and I shouldn’t typically be doing it. The only defensible version of this I can think of offhand is changes that are necessary to get the repo to work in my local, but which I am not yet able or willing to commit and push.

saagarjha · on Nov 23, 2018

You can always use the stash for changes like this.

randallsquared · on Nov 23, 2018

It depends. If you have, as we do at my work, a pre-commit hook that runs acceptance tests on each commit, and if the changes you've made are to get it to work on your local, then leaving those changes in for the test run is really the only option.

Should I get those changes committed? I should. But that involves a lot more testing and thought, since it still has to work for other people with different local dev setups.