I haven't used Gitless before but usually these types of abstractions fall short for me. It never helped me as much as truly learning git.
I learned so much about the commands that intimidated me by playing "Oh My Git!"[1]. It's an open source game that presents commands as quick-to-understand concepts and lets you experiment with them by using playing cards on the visual rendition of a repository. I was honestly surprised nobody else mentioned it here, maybe it's not that well known.
Of course it's not completely accurate all the time in its descriptions, but it certainly helps you understand the underlying workings a bit more, and it pushed me into actually reading the manual.
I have "truly learned git" (including having written a clone for a school project) and still hate the CLI and conventional Git workflow. Why is checkout overloaded to mean "restore a file from the index," "move to a different branch," and "move to a different commit?" Why can reset mean both "un-add a file" and "delete a commit?"
I am very happy that there are better frontends for git like Gitless or Facebook's Sapling. I also think their commands are more of a 1:1 mapping to "raw" git operations behind the scenes.
As of the current git release (v2.41.0), there is nothing in the documentation for "checkout" indicating it is deprecated, while the documentation for "switch" contains the phrase -- in all caps, like this -- "THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."
One presumes checkout will eventually be deprecated by switch, but it isn't yet.
I guess that warning is there to prevent people from using it in a script, because it’s not a stable command. But for normal shell usage, it’s the preferred method.
Preferred how? I get a warning when I use checkout? Who is teaching me it's preferred? Am I expected to keep up with a community newsletter to use my version control tools?
To be fair, even if those commands are still overloaded, there are now new commands that aren’t (restoring, for example, now has its own command: git restore).
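For what it's worth, here is a rough sketch of how the old overloaded checkout maps onto the newer single-purpose commands, run in a throwaway repo (assumes Git >= 2.23 for switch/restore and >= 2.28 for `git init -b`; the repo and file names are made up):

```shell
# The three jobs `git checkout` historically did, next to the newer
# single-purpose commands that split them apart.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main demo; cd demo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo v1 > file.txt
git add file.txt
git commit -qm "initial commit"

git switch -qc feature      # was: git checkout -b feature   (branch creation/switching)
echo v2 > file.txt          # scribble on the working tree...
git restore file.txt        # was: git checkout -- file.txt  (restoring a file)
git switch -q main          # was: git checkout main         (branch switching)
current=$(git branch --show-current)
content=$(cat file.txt)
```

The third historical job, detaching at a commit, becomes `git switch --detach <commit>` in the new scheme.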
This sounds like a great tool, but I'm not sure anything could push me to read the Git documentation. It's written by someone too brilliant for me. I prefer to find a Stack Overflow answer where someone has dumbed it down a bit.
Oh My Git! looks really cool and useful, thanks for mentioning it! I find it incredible that it's funded by the Prototype Fund (https://prototypefund.de/project/oh-my-git/), and it makes me wonder which other countries have similar funds hackers can apply to.
It's likely that they need to include the full standard library in the Windows version (built with something other than Microsoft's C++ compiler, like MSYS g++ or similar).
Wow this is a timely post. Literally two days ago I thought "there should be an easier way to use git" and I went out and bought gitless.io ... I didn't even check what was on the .com!
To the author: I think I'll need a new name for my tool (I was imagining something fairly different and UI based) so if you want the .io just let me know.
I support this fully. Juniors and intermediate engineers I work with still struggle with git quite often and I personally know how that feels firsthand.
Not sure if this solves the issue. I personally recommend using git, but keeping it simple by limiting the allowed operations.
Note the absence of merge and cherry-pick. Most often when a junior has messed something up, it seems to be because of merging in a weird order or cherry-picking the wrong thing before merging or rebasing.
Either my juniors have stopped asking me for advice (totally possible) or this recommendation helps. As I find myself very seldom needing those commands I hope and believe it's the latter :)
I support it, but I also wonder why other teams seem to spend so much more time on git than we do. Our setup is fairly basic: you have a main branch, you can only push things to it via a squash-merge pull request that is linked to a work item, and you can only do so after whatever you’ve built has gone through the release pipeline, which among other things involves spinning up an instance of it for POs or whoever to test.

When a dev wants to change something, they pick a work item on our “mostly kanban” board and create a branch from there named dev/name. Behind the scenes this spins up a build and a deployment to a dev/name deployment slot whenever something is pushed to it. They do their thing, other devs do theirs on similar dev/othername branches, and then when whatever is complete it gets squash-merged into the main branch with a direct link to the kanban work item.
You’re mostly free to do what you want on your dev/name branch. The only thing you really need to teach “juniors” is that they will hate themselves unless they merge main into their dev branches regularly. But if you’re heading home and want to commit something, or you simply commit as often as you hit cmd+s, you can do it with no useful comment because it’ll get squashed anyway. Most people where I work use GitKraken, mostly out of habit, but I think more and more are switching to VS Code extensions, and some might even use the CLI.
The result is that we spend almost 0 time on “git” in our organisation. I wonder if we do things wrong somehow since git seems to cause issues for so many people.
I guess it does depend on the size of your commits. Ours being relatively small (one work item) we still have the history, while devs getting the freedom to not have to make every commit meaningful.
What's helped me much more lately is undotree for vim [1]. It basically logs every single time a file is saved. It's much more useful because commits have to be made by humans, and they may not make them often (and usually there is an incentive for "clean" or working commits). There have been many times where I went back to copy something from the undotree.
That means on average your project must have 50 users to have 1 git specialist. Git problems however seem to happen far more than for 1 in 50 developers I've worked with.
I use git merge a lot (others might too, perhaps only via the Git(Hu|La)b interface though?).
I use git cherry-pick very occasionally. I agree it's probably a mistake in most cases, but there are valid heavily constrained circumstances where it's fine.
I had never heard of git switch. It looks like just another git checkout?
Rewriting history with rebase is a destructive operation and I would never advocate for it to juniors. Merges are much more straightforward because they follow the git mental model of a directed acyclic graph.
> Rewriting history with rebase is a destructive operation.
It's not, because Git history is immutable [0], you can't change it. But I understand that from the user perspective it appears to change, and the consequences for public history "changes" have to be known to prevent any troubles.
> It's not, because Git history is immutable [0], you can't change it.
> It is by design impossible to modify or delete an existing commit with regular Git commands.
These are both just straight up incorrect. Make the commit unreachable from all regular branches (this can be as simple as a reset or rebase - and at this point is effectively rewriting history), then gc. Oh, ah you say, there’s a reflog. But that can be pruned/expired with the very same regular reflog command you refer to in your post (or just change the unreachable expiry from 30 days to now). At that point the commit is truly unreachable and a gc will make it go away forever.
FYI the commands `git reflog expire --expire=now --all && git gc --prune=now` should do the trick for truly deleting commit objects for good.
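A throwaway-repo sketch of that recipe (repo and file names invented, reasonably recent Git assumed): a plain `git reset` orphans a commit, the object survives via the reflog, and it is only truly gone after the expire + gc step.

```shell
# Demonstrate that an unreachable commit really is deleted once the
# reflog is expired and gc prunes it.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo one > f; git add f; git commit -qm one
echo two > f; git commit -qam two
doomed=$(git rev-parse HEAD)                  # the commit we are about to orphan
git reset -q --hard HEAD~1                    # a perfectly "regular" command

# Still recoverable at this point: the reflog keeps the object alive.
git cat-file -e "$doomed" && before=alive

# The recipe from the comment above:
git reflog expire --expire=now --all
git gc --prune=now -q
git cat-file -e "$doomed" 2>/dev/null && after=alive || after=gone
```

After the last two commands, `git cat-file -e` fails because the commit object no longer exists in the object store at all.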
Also the filter commands allow rewriting history.
This makes a difference because your wording implies that there is some inherent structural invariant that makes git history immutable - and there certainly isn’t, there’s just some default policy for how otherwise unreachable objects are recoverable.
Also I think when most people say (correctly [0]) that git history is mutable it is that by default it is easy to rewrite the history of any branch and basically manipulate the commit graph in any way, and all can be done with the standard command line tools. Bringing in the reflog is mostly irrelevant to the concept of history rewriting.
> But I understand that from the user perspective it appears to change
If you force push to a shared repo, there's no "perspective" about it - the history has changed, somebody else may basically get a completely unrelated graph to what they have in their own repo.
Bottom line: The person you're replying to is correct, rebase is a fundamentally destructive operation, because while the objects in git are immutable the references are not and that's all that matters - because a commit without a reference might as well not exist, it's not a history, it's just a commit. Starting with rebase you can prove to yourself that you can make the repo dispose of commit, tree and blob objects forever. I see the distinction that you are trying to make but I don't see why it is either pedagogically or practically a useful one.
Thanks for your input. You're right, my statement is not fully correct and I will update my post. What I mean by "regular" are commands like "rebase" - just running "git rebase" is not destructive in the sense that your previous changes are truly gone (by default). That's actually a pretty important thing to know both pedagogically and practically, as it makes it trivial to undo a rebase. If rebase were actually "destructive", you couldn't undo it.
You've stated a lot of ways to delete commits that I don't see as "regular" (gc, changing the default expire date, filter, reflog, ...). There are even more plumbing commands to delete commits. But yes, I should have explained what I mean by "regular" and also made my point less absolute.
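A small sketch of that trivial undo (throwaway repo, invented branch names): after a rebase, the pre-rebase tip still exists in the object store, and `ORIG_HEAD` (which rebase sets) and the reflog both still point at it.

```shell
# Rebase a branch onto an advanced main, then undo the rebase completely.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo base > f; git add f; git commit -qm base
git switch -qc feature
echo feat > g; git add g; git commit -qm feat
before=$(git rev-parse HEAD)                  # pre-rebase tip of feature

git switch -q main
echo more >> f; git commit -qam "main moves on"
git switch -q feature
git rebase -q main                            # "feat" is replayed, gets a new sha
after=$(git rev-parse HEAD)

git reset -q --hard ORIG_HEAD                 # trivial undo: the old tip still exists
restored=$(git rev-parse HEAD)
```

`git reflog` would show the same pre-rebase commit if `ORIG_HEAD` had since been overwritten by another history-moving command.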
> If you force push to a shared repo, there's no "perspective" about it - the history has changed.
It depends on what "history" and "changing" mean to you. If it's the entire Git graph, then every commit changes the history (which it does). What I mean by a change in this context is that I can take a graph and alter the existing commits (change order, move, change the content, etc.), and that's not what happens with a rebase, for example. I should have clarified this as well. It comes down to the "branches are just pointers" discussion, so for me, moving a branch does not "change" the history (= the commit graph). But I also understand that from a different perspective - viewing the branch as the pointer + the commits it reaches - the history actually changes (but not in the way I meant it). It's important to state which of those one is talking about.
> You've stated a lot of ways to delete commits that I don't see as "regular" (gc, changing default expire date, filter, reflog, ...).
If reflog is not regular then by your own set of constraints you’ve “deleted” a commit the moment it’s unreachable from any regular branch or stash - which is what most people think when they say “deleted.”
Deleting a line of code or a paragraph of text is still deleting even if I can undo it through the very system we’re discussing.
I don’t actually think gc is germane but it was to drive home the point that there’s nothing like cryptographic signatures or something that make objects undisposable.
Further a force push after rebase is destructive because there is no reflog on a remote bare repo.
The corollary is that force pushing deletes commits that are impossible to recover with any “regular” git command.
> That's actually a pretty important thing to know both pedagogically and practically, as it makes it trivial to undo a rebase.
Then just simply say things can be undone with the reflog.
> What I mean by change in this context is that I can take a graph and alter the existing commits (change order, move, change the content, etc.), and that's not what happens with a rebase for example.
> It's important to state which of those one is talking about.
Why?
> But I also understand that from a different perspective - viewing the branch as the pointer + the commits it reaches
I don’t understand what is “different” about this perspective, a branch is literally the pointer. A commit object is not a branch.
> moving a branch does not "change" the history (= the commit graph).
It will point to an entirely new commit graph. This is a change. The commit graph you refer to may very well be orphaned (ie deleted) at this point.
> Further a force push after rebase is destructive because there is no reflog on a remote bare repo. The corollary is that force pushing deletes commits that are impossible to recover with any “regular” git command.
That's somewhat true, there's a whole section about this in my post. But in that case it's not the rebase that is "destructive", it is the forced push. And even that is not destructive as in "unrecoverable". Those changes are not impossible to recover. I can undo my rebase and force-push it again. So can my colleagues working on the same branch. Yes, it's impossible to recover from the bare server's perspective, but this has nothing to do with my original claim that "rebase is not destructive by default".
> Then just simply say things can be undone with the reflog.
I did. That's why I have a problem with the word "destructive": the whole meaning of that word is that something cannot be undone. This was the main point I was trying to make.
>> moving a branch does not "change" the history (= the commit graph).

> It will point to an entirely new commit graph. This is a change.

> The commit graph you refer to may very well be orphaned (ie deleted) at this point.
No, the commit graph doesn't change when moving a branch. "git reset" will move the branch and not change the graph of the commits in any way. That's why "git reset" is exactly as (non-)destructive as git rebase. Yes, git gc can then actually alter the commit graph (i.e. drop orphaned commits), but that's another point I've explained and warned about in my post. It doesn't affect the non-destructiveness of git rebase or git reset.
But maybe you're just seeing the reachable commits as "the commit graph" and the reflog as something special, leading to our misunderstanding. I don't - both reachable and unreachable commits are part of the commit graph internally.
> That's why I have problem with the word "destructive", because the whole meaning of that word is that something cannot be undone.
The only problem with this definition is that it is a bit useless, nothing is destructive. Even without the reflog (or stash) you can always restore from backup, or even an ad-hoc locally cloned repo (git makes this especially easy). I agree it's good to know you can undo things and what the built-in facilities are, but it doesn't further elucidate the issues with rebasing as a workflow.
The original statement:
>> Rewriting history with rebase is a destructive operation and I would never advocate for it to juniors. Merges are much more straightforward because they follow the git mental model of a directed acyclic graph.
Is clearly in the context of an ongoing workflow involving rebases. Doing an operation and then immediately or even later just undoing it by reverting from the reflog or some old ref is not what is referred to, nor is it relevant to their objections (obvious as they are comparing it to merges).
rebase is a destructive operation from the perspective of the branch the moment you do it, and then when you push it becomes someone else's problem very quickly. They are essentially on the conveyor belt to be destroyed. 30 days isn't even that long, and hoping that your coworkers didn't blow away their local repos doesn't scale.
I also don't see how quibbling with the wording "rewrite" vs "alter" is useful.
>> It will point to an entirely new commit graph. This is a change.

>> The commit graph you refer to may very well be orphaned (ie deleted) at this point.
There's no precise generally agreed upon definition of "commit graph" in git near as I can tell, even the commit-graph caching facility takes a --reachable option and this is commonly how it is used with many tools, and in books "commit graph" has often been used to refer to a graph starting from a specific commit (as in "git log --graph displays the commit graph", "...when using git log -g to examine the reflog instead of the commit graph ..."). Having said that I won't argue the point about wording and I don't disagree your definition is valid. But I don't think you can claim some authority without a mutually agreed definition. This is also an odd time to start being pedantic.
The actual brass tacks on this is that moving a branch is a change in the mutable repository state. This hangup on the commit objects being immutable while "only" the pointers change is a meaningless distinction for the matter at hand. Once you get to the most common use case of git, using shared repos, those commit pointers (branches/refs) are what get published, and that is what defines the shared constitution of "the commit graph", so they are what matter.
> But maybe you're just seeing the reachable commits as "the commit graph" and the reflog as something special
Well it is special. It's repo specific, not shared, and non-essential to the normal operation of git.
But that's not the objection. The main one can be boiled down to, what gets shared and why it gets shared and how the graph looks to clones is more important than defining what is the graph in a private repo. Especially when making someone understand the consequences of rebase vs merge.
> I don't - both reachable and unreachable commits are part of the commit graph internally.
I mean this isn't wrong for one general definition of commit graph, but it isn't always a terribly useful definition.
git itself puts virtually no restrictions on this internal commit graph as defined. A git repo/packfile can contain any arbitrary collection of commit objects; you can have two completely unrelated sets of trees with commits. You can clone the Linux kernel source and then, say, fetch React into the same git repo, and now you have a disconnected commit graph. This is cool and all, and there are some practical applications for it, but I don't see how it is related to the issue of history destruction/rewriting in the context of a shared rebase workflow. Thinking in terms of reachable, strictly non-disconnected subgraphs is often more useful.
The rebase crowd seems to be obsessed with it due to wanting linear history.
In my opinion, the reason for all of this is due to how `git log` shows history. If you merge a branch, then every commit in the branch is now scattered among your mainline history, making it confusing to see what is going on.
So, instead of fixing `git log`, you end up rebaseing (and possibly squashing) on top of the main branch, then doing a fast forward commit so you get a "clean, linear history".
But, if git just changed the default of `git log` to `git log --first-parent`, then so much of this would go away.
Only top level commits (directly to the branch, fast-forward merged, or a merge commit) show up in the history.
Now, let's say I have a branch that I've been working on for 2 weeks. I have 20+ commits. During that time, I've already needed to merge the main branch into my branch to keep it up to date several times.
I can either rebase the whole thing, to clean up history, remove the merge commits from main, squash unnecessary changes, and so on, just so I can get a "clean" history in main, or, we can just do a merge commit bringing in my whole branch with a good commit message.
I prefer the merge commit because it is then very easy to track _all_ of the changes made in my branch by a single commit. I can do a diff on the merge commit and see all changes. I can cherry-pick just the merge commit and bring in all the changes. If I need to break down what the original developer of the branch was doing, I can walk through all the sub-commits.
There is no need for them to rewrite history and hide their workflow or their process.
But again, the problem is, by default, `git log` shows this as a huge mess, whereas `git log --first-parent` would _only_ show the merge commit, and would hide all the child commits that happened in the branch. After all, the top level merge commit is what is important, we only care about the subcommits when we need to dig in.
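A throwaway-repo sketch of that difference (branch and ticket names invented): after a true merge commit, the plain log shows every branch commit interleaved, while `--first-parent` shows only the mainline.

```shell
# Merge a topic branch with --no-ff, then compare the two log views.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo base > f; git add f; git commit -qm base
git switch -qc topic
echo a > a; git add a; git commit -qm "topic: wip 1"
echo b > b; git add b; git commit -qm "topic: wip 2"
git switch -q main
git merge -q --no-ff -m "Merge topic (TICKET-123)" topic

everything=$(git log --oneline | wc -l | tr -d '[:space:]')                 # base, wip 1, wip 2, merge
mainline=$(git log --oneline --first-parent | wc -l | tr -d '[:space:]')    # base, merge
```

The wip commits are still reachable through the merge's second parent; `--first-parent` only changes which commits `git log` walks, not what is stored.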
> But, if git just change the default of `git log` to `git log --first-parent`, then so much of this would go away.
Not really.
If you don’t care about the individual commits that lead to a merge (you only care about the final merge) then you could just do `git commit --amend` through the whole process. The end result would be the same.
But presumably these people (who aren’t narrowly focused on a linear history) do care. And (as you seem to be saying) sometimes you want to go down into the second parent of the merge commit.
And in that history (with just merges) there’s a bunch of random “synch. points” where they merged the original branch into theirs. Why? Maybe because it had been three hours, maybe it was the start or end of the day, or something else?
Can you look at the three-way diff and see why?
And then there are all sorts of half thought out commits and missteps that could have been edited out if they used rebase.
Using rebase does have drawbacks. But you can get back value proportional to the time you invest in it. So it’s not a simple matter of discovering that `--first-parent` is a thing and then setting an alias for that (really, you thought that would be it?).
And using merges only (including synch. merges) also has its advantages. But no, using rebase is not just about hiding pull request branches.
I don’t think ‘linear history’ captures everything.
In the orgs I’ve worked in, it was important that mainline (bugs notwithstanding) always be ‘runnable’. Many people (myself included) will happily commit in-progress changes to branches they’re working on. If these in-progress changes get into the mainline branch - even as a result of a merge - they become part of its history and break the ‘always runnable’ property.
That's still not a problem: as long as you consider the first parent the canonical parent, you can still make assumptions like this along the canonical history without sabotaging the true history.
The “run on every commit” stance is an argument against rebasing. Without it you have even more freedom to rebase your branches. And there’s less reason to care about the “true history” (bogus concept).
That's the simple and safe use of rebase. Rolling up your local wip commits into a coherent story of one or a few clearer commits.
The more-complicated but still safe use is to rebase your local branch on top of an advanced upstream before pushing. Some people prefer merge for this case, but it is somewhat impure.
The problematic use of rebase is on pushed branches. Especially if you have collaborators.
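A non-interactive sketch of that first, safe use (throwaway repo, invented names): rolling several messy local wip commits into one coherent commit before pushing. Here it's done with `git reset --soft`, which ends up equivalent to an interactive rebase where everything is squashed.

```shell
# Collapse three local wip commits on a private branch into one commit.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo base > f; git add f; git commit -qm base
git switch -qc feature
for i in 1 2 3; do
  echo "step $i" >> f
  git commit -qam "wip $i"                   # messy local checkpoints
done

git reset -q --soft main                     # keep the combined diff staged
git commit -qm "feature: add the widget as one coherent commit"
count=$(git rev-list --count main..HEAD)     # the branch is now a single commit
```

`git rebase -i main` with `squash`/`fixup` lines gives finer control (reordering, keeping several commits), but for the simple roll-up the soft reset is the same end state.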
> The problematic use of rebase is on pushed branches. Especially if you have collaborators.
Yep. As soon as you have a PR out and have to address comments, you have a public branch and can't safely rebase it. So you can't roll those new changes into the "clean" history. If upstream also has changes before your PR gets merged, you'll have to add a merge commit anyway.
This is a problem with GitHub, not Git. GitLab, for example, preserves comments across rebases and other commit changes just fine.
"Don't rebase public branches" is an easy way to say "if someone has built something on top of your branch, don't rebase unless you expect them to also rebase." 90% of the time this is not actually harder than merging, though, and it still produces a better history and more atomic commits.
At this point I'm even only 99% firm on "never rebase master", as long as release tags are truly immutable.
A bit overstated but yes. Merge squash works every time.
Rebase, in my experience at work, led to inexplicable conflicts the several times I tried it. When that happens, the rebase preachers are nowhere to be found, and you’re on your own.
Never did that exactly. I think I've either started a new branch from the waiting one, or started a new branch from main and then pulled from the older feature branch. I have no recollection of any problems, but it is rare enough that I haven't tried it recently.
Can you explain? To me the SOP is: feature branch, work there, commit occasionally, then when it's good: review, squash, and merge (99% of the time done on the remote). The history is clean, with the ticket ID attached to the final merge commit (as the feature branch is treated as atomic).
I do as well, and I also cherry-pick a lot, specifically with --no-commit, to pull together disparate ideas from various sources. I am always fascinated when git usage comes up in threads, because there are so many ways people use it that differ from my own, and we all seem to find value in it!
I always feel like the main issue with tools like that is that you'll then just have to learn a separate syntax that's not portable and will be even harder to find instructions for.
For all those who think that Git is easy, it's not.
It takes a student about 20-30 hours to really understand all usual Git commands. I'm not talking about something bizarre like ls-remote, but really understand merge, rebase, bisect. Not just syntax, but all the consequences. For instance, why rebasing something already pushed to public often leads to false merge conflicts, how does stash really work, etc.
If you’re including rebasing, then I can believe this (rewriting history is nuanced and complex), but I’m not sure I’d be teaching that straight away. My recommendation is usually only to start using rebase once you’re already very familiar with the rest of git, since it’s almost never necessary to achieve what you want.
Also, bisect is definitely not common, and the vast majority of my colleagues wouldn’t know it exists; I’d place it into the look-it-up-if-needed category.
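For anyone filing it under look-it-up-if-needed: bisect can even drive itself with a script, which is most of what there is to know about it. A throwaway sketch where commit 4 of 6 introduces a fabricated bug.txt and `git bisect run` finds it automatically:

```shell
# Build six commits, one of which introduces bug.txt, then bisect for it.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
for i in 1 2 3 4 5 6; do
  echo "change $i" >> f
  if [ "$i" = 4 ]; then echo broken > bug.txt; fi   # the bug arrives here and stays
  git add -A; git commit -qm "commit $i"
  if [ "$i" = 1 ]; then first=$(git rev-parse HEAD); fi
  if [ "$i" = 4 ]; then culprit=$(git rev-parse HEAD); fi
done

git bisect start HEAD "$first" >/dev/null    # HEAD is bad, the first commit is good
git bisect run sh -c '! test -f bug.txt' >/dev/null   # exit 0 = good, nonzero = bad
found=$(git rev-parse refs/bisect/bad)       # the first bad commit bisect identified
git bisect reset >/dev/null 2>&1
```

In a real project the `sh -c` test would be your build or test command; bisect then needs only O(log n) runs to pin down the offending commit.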
I’ve used git successfully professionally for almost 10 years and I’m not lying when I say I’ve used rebase about 5 times. We just merge. We squash sometimes. I think it must be because I’ve always worked in smaller orgs but rebase just seems to always be over complicating something simple (a merge). I get that there’s a benefit of a cleaner history but to me the benefit of simplicity merge offers makes it superior
Done right and understood, it is as simple as merging; if conflicts arise, is that any different from conflicts via merge?
But you get rid of those ugly back-and-forth merges and unviewable histories. Not sure why more people don't care; I guess it is a little OCD that I do this for my own stuff. But in the 20-50+ user projects where everyone, often multiple times, merges develop or even other branches in, the history is unusable right away, let alone at any later point. Git has derailed into a mere backup system where you can jump back, but understanding changes later becomes impossible :(
What people also rarely know: a linear history is without lies, but one can sneak additional changes that were in neither branch into a merge commit quite easily - I hate that!
Most people don't care about most stuff, it's pretty normal.
But here specifically: Most developers don't read code, nor documentation, and essentially no one reads commit messages. VCS is a WOLM - write-only linear memory - to most developers. That's probably also why most people don't care to write any kind of reasonable commit message - the software requires them to input something, so something is input - and also do not care about unreadable (and also unusable) VCS histories. They're never looking at it, it's only written. Hence it's meaningless for them if it is extremely difficult to trace changes or hard to bisect, they don't even attempt such things. There's a reason the history browsers in all the online tools and IDEs effin suck, it's because on average nobody uses history.
I know, I know, this comes across as very elitist, but it's just how most people do their jobs. They get bugged to do a thing, so they sort of do the thing in order to not be bugged any more about it. I'm pretty sure that caring about these things is some kind of OCD like you say, i.e. a mental disorder.
My biggest issue with rebase from a learner’s perspective is that you get conflicts that simply don’t happen with merge.
I’ve had experiences where I’m trying to rebase a branch and it keeps flagging my most recent changes during the rebase, when the intention is that I want all previous changes applied first and then, as a final step, to be shown a conflict if applicable; don’t show me one when it’s not the “final” step.
Admittedly maybe I’m doing it wrong.
I also don’t like how rebasing “main” onto my branch seemingly takes my branch and rebases it onto “main”? Maybe that’s a fault in my IDE, I’m not entirely sure.
I think the difficulty with rebase and merge is much about semantics. When you "rebase" what does that word suggest to you? To me it sounds like "Base Again". So I'm changing the base from which I started the current branch to something else?
Not quite. That would mean that my old base is REPLACED by something else, right?
Instead rebase means take some other branch then put all the changes in the current branch on top of it, keeping everything in the current branch but just adding them on top of something else, right?
It clicked for me when, in my IDE (WebStorm), I saw the menu option allowing me to choose another branch and perform "Check Out and Rebase on top of the current branch". So you don't just "rebase a branch", you combine two branches. And when you combine two things here, the order matters, and it is easy to confuse the order.
Similarly, if you merge two branches, which one overrides, the left or the right operand? It is easy to get confused. Maybe I am. It is not my fault.
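The order question has a concrete answer in the CLI at least: `git rebase <upstream> <branch>` checks out <branch> and replays its commits on top of <upstream>, so the branch named last is the one that moves. A throwaway sketch (invented branch names):

```shell
# Show that in `git rebase main feature`, feature moves and main is untouched.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo; cd repo            # -b needs Git >= 2.28
git config user.email demo@example.com
git config user.name demo
echo base > f; git add f; git commit -qm base
git switch -qc feature
echo feat > g; git add g; git commit -qm feat
git switch -q main
echo more >> f; git commit -qam "main advances"
main_before=$(git rev-parse main)

git rebase -q main feature                   # feature is replayed on top of main...
main_after=$(git rev-parse main)             # ...and main has not moved
on_top=$(git merge-base --is-ancestor main feature && echo yes || echo no)
```

Merge is the mirror image: `git merge other` changes the branch you are currently on, never `other`.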
It does not help that git really is checkpoints rather than deltas. The semantic difference is usually ignorable, but it can matter when you're trying to rewrite history. It's coming up with deltas on the fly, and if it gets confused it can get very very lost.
Every rebase really is at least three branches: what you're merging into, where you started, and where you are now. And God help you if somebody rewrote history on any of those.
The fact that all of these have terrible names makes it so much worse.
In my org we require all important data to be listed on a pull request, then enforce squash merge. If you have multiple distinct changes you want reflected in the history then you make multiple PRs.
I use rebase essentially every working day. I use rebase for managing my local tree of patches, regardless of interaction with colleagues. Merge is fine, I guess, but do you never get the order wrong and want to reorder things before you push? Or keep moving around some local debugging patch on the top of your stack, without pushing it to your colleagues?
IMO merge is almost never what you actually want, unless you've been working separately for a long period of time (and generally you should not be doing that, because it leads to surprising conflicts / regressions at merge time).
It depends what you mean by separately. Big organisations can have dozens of teams implementing unrelated functionality with asymmetrical overlaps for conflicts. I've never found a situation where rebase was appropriate for this kind of setup.
If the functionality is unrelated, I'm not sure why rebase would be inappropriate. It just wouldn't matter much other than keeping history clean (and rebase would make for cleaner history than merges).
What is the benefit of this to isolated teams? Rebasing a shared main that is under a constant stream of PRs is tedious and time consuming. What is the justification of this in terms of time spent for a company paying for that time? How is that time recouped via the availability of a linear history?
Git merging is completely stupid. It collapses multiple changes into a single commit. The original chain of commits is referenced, but in a useless way that only complicates the git history.
When you merge 17 changes from foo-feature into master, master's own line of history gains only a single commit. You cannot bisect master to determine which of the 17 broke it.
The 17 commits are there, but only in their original form, based on some old commit. Those 17 original commits are not equal to the single merged commit. The single merge could have a bad merge.
If there is going to be a bad merge, it's better to have a bad merge in one of 17 commits being individually rebased, than to have a bad merge in a single merge bomb that conflates 17 commits.
Your little branch of the original 17 commits should be purely a private object; it does not belong in the upstream. It's just a historic accident that your work was originally based on some three-week old random commit that happened to be latest at the time when you started. There is no need for that to be published. Your job is to make sure your code is based on the latest commit and promote it to the branch, as a sequence of individual changes (which all build and pass unit tests, etc).
I've literally not typed "git merge" since 2010, and around that time I learned how not to have Git perpetrate unwanted merges on me by learning never to type "git pull". "git pull" can be configured to be "git pull --rebase", but you will forget. I trained myself to do "git fetch", then "git rebase".
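The fetch-then-rebase habit, plus the config that makes plain "git pull" safe, sketched in a throwaway repo (remote name "origin" and branch "main" are assumptions of the sketch):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main upstream
cd upstream
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m base
cd .. && git clone -q upstream work && cd work

# The habit described above: fetch, then rebase explicitly.
git fetch -q origin
git rebase -q origin/main

# Or configure pull so that forgetting --rebase is impossible:
git config pull.rebase true
git pull -q     # now means fetch + rebase, never a surprise merge
```

With `pull.rebase` set per-repo (or with `--global`), "git pull" stops perpetrating unwanted merges even when you forget the flag.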
In all my last three jobs, the review tool Gerrit was used. Gerrit is based on cherry picking, which is rebase. Commits are submitted in their original form (not necessarily rebased to the target branch, but indicating which branch they are for). When approved, they are submitted and that means cherry pick. Maybe Gerrit can merge; I've never seen it used that way.
Gerrit has rebase right in the UI. You can take a commit and rebase it either onto its logical parent (the latest patch set of the apparent parent it was submitted together with), or onto the current head of the branch.
FWIW - I've worked with both rebase workflows and merge workflows. If you rebase a long lived branch what breaks is /your/ code not the rest of the codebase in a rebase workflow vs the opposite being true with merges.
It depends heavily on the culture of the specific team. I've seen teams obsessed with a "clean" history, rebasing and squashing everything they possibly could. I believe it's a matter of taste; I never understood the argument.
squashing is a form of rebasing, albeit a narrower and simpler case. Indeed I do not understand why rebasing is so popular instead of merging when you're going to squash at the end anyway.
You have to fix conflicts either way so what's the difference? I suppose when merging you only have to fix the cumulative conflicts, while when rebasing you have to incrementally fix the conflicts introduced by every commit which is annoying. I usually squash a branch (rebase on where it diverged) before rebasing to fix this.
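The squash-before-rebase trick described above is just a soft reset to the merge-base. A sketch in a throwaway repo (branch names are made up):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
echo base > f; git add f; git commit -qm base
git switch -qc feature
echo one >> f; git commit -qam one
echo two >> f; git commit -qam two

# Collapse the branch to a single commit at the point it diverged:
git reset -q --soft "$(git merge-base main HEAD)"
git commit -qm "feature, squashed"

# Now a rebase replays one commit, so conflicts only have to be fixed once:
git rebase -q main
```

After the soft reset the index still holds the branch's final state, so one commit recreates all the work as a single change.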
It’s worth configuring rerere (reuse recorded resolution): https://git-scm.com/docs/git-rerere for merges and rebases, because repeated rebases or merges of similar code will become substantially easier.
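rerere in action, sketched in a throwaway repo: resolve a conflict once, throw the merge away, redo it, and the recorded resolution is replayed.

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
git config rerere.enabled true
echo base > f; git add f; git commit -qm base
git switch -qc topic; echo topic > f; git commit -qam topic
git switch -q main;   echo main  > f; git commit -qam main

# First merge conflicts; resolve it by hand. Committing records the resolution.
git merge topic >/dev/null 2>&1 || true
echo merged > f; git add f; git commit -qm merge1

# Undo the merge and redo it: rerere reapplies the recorded resolution.
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true   # "Resolved 'f' using previous resolution."
git add f; git commit -qm merge2
```

By default rerere fixes up the working-tree file but does not stage it (see `rerere.autoUpdate`), so you still get a chance to review.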
> After rebasing you basically have two versions of the same commit, and that is the root of the problem.
No, you have two commits. Git commits are snapshots, not diffs. Git's data model is simple. The fact that people do not bother trying to understand it is the root of the problem.
Rebase got a lot less annoying now that git has gained the update-refs ability. I'm often working on multiple branches simultaneously, and update-refs lets me easily work on them as a stack.
Here's how i use it: I call my HEAD "bl-dev". Let's suppose I have 5 commits on bl-dev that aren't on main. Each of those commits is a branch with a functional name. If I type "git rebase --update-refs origin/main" all 6 branches get updated, not just bl-dev.
Let's suppose the next thing I do is add something to the bottom of my 5 branches. I dont switch to the branch, I stay on bl-dev and add a commit to bl-dev. Then I type "git rebase --update-refs -i HEAD~7" and move the commit up in the stack to the correct location and again all 6 branches update.
"git rebase --update-refs -i" also gains the ability to change which commit any branch points to.
I don't actually type "--update-refs", it's a gitconfig.
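The gitconfig in question is presumably `rebase.updateRefs` (git 2.38+). A minimal sketch of a two-branch stack, with made-up branch names:

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
echo 0 > f; git add f; git commit -qm base
git switch -qc feat-a;  echo 1 > a; git add a; git commit -qm feat-a
git switch -qc bl-dev;  echo 2 > b; git add b; git commit -qm dev
git switch -q  main;    echo 3 > m; git add m; git commit -qm mainwork

# Rebase the top of the stack; --update-refs drags feat-a along too:
git switch -q bl-dev
git rebase -q --update-refs main

# Make it the default so plain "git rebase" always behaves this way:
git config rebase.updateRefs true
```

After the rebase, feat-a points at the rewritten commit on top of main, not at the stale pre-rebase one.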
I don't know how successful of a metaphor this would be for students, but for developers who haven't had any familiarity with DVCS that I have worked with, I use a coat-rail analogy with some success:
You start with a "main" rail of coats. Each coat is given a number, indicating what order it is in.
You start adding coats to a new, empty, rail, using the next number. When you are done collecting coats on this new rail, you want to move your coats onto the original rail.
In the time it has taken you to collect the coats onto your rail, someone else has already added coats from their new rail to the end of the "main" rail, and they used the same base number that you have - which means you cannot just add the coats from your rail onto the main rail, else the numbering would be broken - we call this a conflict. (NB: There is a magic guardian of the rails that means this just cannot happen, which is a plot device that is just to make this metaphor work so don't question it, it's just a metaphor.)
To resolve the conflict you have the following "rebase" options:
- Slide all of the coats on the main rail up to make room for yours, and update all of the numbers on your coats, using the new number from the last coat as the base number for your rail's sequence. This ensures that everything on the main rail is "before" your rail. This is the equivalent of a rebase. After this, you can then put your coats on the end of the main rail without conflict.
- Then there is the option to take a copy of the main rail onto the front of your rail, and organising the coats one-by-one (much like in the above), placing your coat(s) in between some of the new coats on the main rail, then forcibly replacing the main rail with your updated rail. This last one is just as drastic in reality as it sounds in this metaphor. We call this the "interactive rebase" and it is rewriting history for anyone who has used the main rail before.
- Finally we have a merge which means adding one big magic coat at the end that takes the coats from both rails and combines them in a singularity, with the new base number being Hawking's radiation or something, I dunno. The metaphor is no good for merges.
(Truth be told: the entire metaphor started in my head some years ago with just the visualisation of that distinct coat rail sweep noise to squish the garments on the rail to onside to fit what our main protagonist is holding onto the rail, the rest I just back-filled over time)
If you teach all the concepts of Git (e.g. commits point to parents but not to children or branches are labels that point to a commit) properly it takes some time but then you get a lot of the more advanced things such as rebase kind of for free. In my experience, people often struggle with those because they have no clue how Git internally works. I had the same problem but when I looked into that it clicked and suddenly all the commands made a lot more sense.
Using rebase is crucially important: anyone who is ready to start using git to track a remote repository and produce new changes to be pushed must learn about it. You have to use rebase to rewrite your unpublished commits on top of the latest upstream in order to be able to push a fast-forward change.
Many new users of git don't have the luxury of learning how to use local-only git with no remote.
Now rebase is a farm implement: a mechanized cherry picker. Cherry picking should be taught first, and then rebase explained in terms of being a multi-cherry-pick operation.
Before teaching cherry picking, you have to teach that Git is based on snapshots and not deltas. Git cherry-pick is part of tooling that is inside Git, but external to its snapshot-based storage model.
When you cherry pick some commit into your current branch, the tool finds the common ancestor between your branch and that commit. It then does a three-way diff using the files in the cherry-picked snapshot, your own branch's snapshot, and the common-ancestor snapshot. The three-way diff produces a merged version that becomes a new commit: another snapshot.
If I ran a class on Git, we would spend part of a lecture doing manual merges with the diff3 utility: I would have the students take some ancestor file and make a "my" and "yours" with different changes, and merge these with diff3. We would go through conflict markers and all that.
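That exercise needs nothing but diffutils. A sketch of both the clean case and the conflicting case:

```shell
tmp=$(mktemp -d) && cd "$tmp"
printf 'a\nb\nc\n' > ancestor
printf 'A\nb\nc\n' > mine        # "my" change: line 1
printf 'a\nb\nC\n' > yours       # "your" change: line 3

# Non-overlapping edits merge cleanly:
diff3 -m mine ancestor yours > merged
cat merged                       # both changes combined

# Overlapping edits produce the familiar conflict markers instead:
printf 'X\nb\nc\n' > theirs
diff3 -m mine ancestor theirs || true
```

This is exactly the three-way merge git performs internally, minus the snapshot plumbing around it.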
Old time hackers who used other version control systems before Git knew all this stuff already. The first time I encountered conflicts, I already knew how to resolve them. Just a few git concepts were new like having to add resolved files to the index.
Imagine you know nothing about version control. Words like "unified diff" ring no bell. You've never seen conflict markers. You've never applied a patch or produced one.
> I’d place it into the look-it-up-if-needed category.
Disagree. Bisect is so useful in so many different scenarios that learning about it and the basics of how to use it is a great way to get people into git. Obviously not right at the beginning of their learning curve but as soon as the basics have been covered satisfactorily.
I see bisect as a git superpower. Also grep (with rev-list).
It's like knowing regular expressions. Not that one writes regexps every day, but this knowledge/skill definitely uplifts one to the next level.
I basically have to relearn regexes almost every time. Yeah, it's a bit easier each time, but I still have to go back through some basic examples to get those old neurons stirring.
> I’d place it into the look-it-up-if-needed category.
Knowing how to use git bisect is what elevates a programmer to the next level. Just understanding how it works gives you a new way to reason about bug finding and fixing (or feature development using old and new), and then actually using it can make you a bug fixing master.
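A toy bisect run, in a throwaway repo where commit 7 (arbitrarily) introduces a greppable bug. `git bisect run` drives the whole binary search from a pass/fail command:

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
for i in $(seq 1 10); do
  if [ "$i" -ge 7 ]; then echo bug > code; else echo ok > code; fi
  echo "$i" > n
  git add code n
  git commit -qm "commit $i"
done

# Mark a bad and a good commit, then let the test command do the rest:
git bisect start HEAD HEAD~9 >/dev/null
git bisect run sh -c '! grep -q bug code' >/dev/null 2>&1
first=$(git bisect log | grep 'first bad commit')
git bisect reset >/dev/null 2>&1
echo "$first"
```

The test command exits 0 for good commits and non-zero for bad ones; bisect then names the first bad commit in ~log2(n) steps.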
It feels like a weird and magical superpower but realistically I find use for it about once every 6 years and even then I'm fairly sure I could have done without it.
This was not my impression when I first learned about it. I thought I'd be using it all the time.
> and even then I'm fairly sure I could have done without it.
Just understanding how git bisect works is the real superpower. The tool is a nice add-on, but often you're right: it's easier to bisect by hand by going to a known working commit and then doing a smart bisect based on the code you think might be offending.
But at least knowing the concept of bisection is a huge game changer.
I'm pretty sure every programmer learns about binary search near the beginning, and even without that has searched through some alphabetical listing without starting at aardvark (if not, I'd argue they have zero hope at much of anything abstract).
Really feels like a stretch to credit git, of all things, with that fundamental understanding. It's like saying you need to operate a nuclear power plant to understand the benefits of locking doors.
When people search through alphabetical listings, they usually don't use plain old binary search, but instead use something like interpolation search. For example, if you want to find the word "xylostroma" in the dictionary, you aren't going to start in the middle. Instead, you'll start about 90% of the way through, and then make adjustments based on how far off you were.
Yeah but that is only optimal if you know the general location. Often when you have a bug you don't know where it snuck in, especially if it's something like a race condition.
I don’t disagree, and I personally use it, but it’s a decent distance from teaching students the basics of git. I’d bundle it into a group of tools that are useful to know the existence of, and look up the docs when needed.
But 20 hours seems a worthwhile investment to me for something they will likely use for at least the next decade of any serious work with software? I'm assuming that this doesn't mean literally 20 hours of instruction.
There are a lot of people who could benefit from git (or who are forced to interact with it sporadically) for whom this is a real problem.
The problem is also, if you don't use it daily git knowledge decays fast.
There is a piece of advice that was making the rounds (unironically) at the research institute where I worked before: "Before you do anything with git, first make a local backup copy."
Can't agree. While learning git isn't trivial it is still easier than it was learning Subversion and at all places I have worked at a decent share of the devs understood git. If a dev cannot grasp git I would not trust them with any non-trivial code base.
you mean merging branches? because in all other aspects, subversion is significantly simpler and i would say far better UI. Much of this also comes from not being distributed
No it's not. The recovery strategy when you've gotten your repo into a mess is "wipe, clone, restore local backup, try the git thing again".
But even then, what's the benefit of using a strange git command, which means I have to track another bit of state only visible to git, which can easily be misused (by applying the stash to the wrong commit), and anyway doesn't save the actual working directory I care about (untracked files), over simply making a file system copy?
The benefit of using a strange git command would be learning what a strange git command does and then it's no longer strange. You'll learn how to create, read, update, and delete that new bit of state in git. Then you won't have to wipe, clean, restore local backup, try the git thing again.
You can start saving your work somewhere else, getting a fresh copy, and then try that git thing again when you know how git stash works. What you're proposing is like not learning how an incremental compiler works and suggesting to wipe, clean, restore local backup, try to compile the thing again to fix compile errors every time you have a compile error instead of learning how an incremental compiler works.
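A quick tour of that stash lifecycle (create, read, reapply, delete), sketched in a throwaway repo:

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
echo base > f; git add f; git commit -qm base

echo wip >> f                         # uncommitted change
git stash push -q -m "wip on f"       # create: save it and clean the worktree
git stash list                        # read:  stash@{0}: On main: wip on f
git stash apply -q                    # reapply to the worktree (stash is kept)
git stash drop -q                     # delete the stash entry
# note: untracked files are NOT stashed unless you pass -u to push
```

The point about the working directory stands, though: plain `git stash push` ignores untracked files, which is exactly the `-u` caveat in the last comment.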
The problem is that git alternatives which achieve pretty much the same thing do not take 20 hours to learn. A lot of those 20 hours are spent because git has a poor UX and a poor execution model that almost requires understanding its underlying workings and data structures to be able to use it effectively.
There are several alternatives where you can be as effective, if not more so, in a fraction of the time. The one that I preferred was mercurial, but unfortunately, since GitHub was so successful, we are all forced to learn git.
I've used vim for 30 years now; for 15 of those it was my main text editor / IDE-alike. I was one of those who would spend hours on achieving the perfect configuration and mix of scripts. I was one of those who smugly thought "I'm so much more productive than those idiots using Eclipse or Sublime".
But you know what? I moved to VS Code a few years ago and my productivity only increased. Everything is more intuitive and discoverable. Everything just works with no tweaking. All the convoluted macros and text manips I painstakingly crafted and memorized in vim are achievable in a few clicks in Code.
Spending 20+ hours learning emacs or vim is quite literally a waste, it's time we all admit it.
(I still think one should learn the basic cursor keys and how to save/quit in vi, because sooner or later you'll have to edit something on a remote server with only vi. Learning emacs, though, is completely useless.)
> Spending 20+ hours learning emacs or vim is quite literally a waste, it's time we all admit it.
For me, with my ability to learn at the rate that I do, Emacs provides the best solutions for Git (Magit), email (mu4e), calendar (Org), to-do lists (Org), note taking (Org) and more.
And I don't use software like mu4e because I already use Emacs; I use it because I have installed and evaluated dozens of email clients over the years and concluded that mu4e is the best for me given my requirements.
For people who aren't as particular about the software they use, Emacs may very well be "quite literally a waste".
Yup. I couldn't ever use Git without Magit. The nice part about Magit is that it's already a very discoverable interface and, like Gitless, it just runs git commands asynchronously, so you can inspect the exact things Magit is doing. Yet it's very painless to blame, rebase, squash, cherrypick, ... compared to the Git CLI and other graphical tools I've tried using in the past (mostly the built-in Git interface of IntelliJ).
Given the poster mentioned using Vim for 30 years, I assume he is blindly lumping in Emacs because I've never had to maintain macros or text manipulations -- smartparens and the built-in text yanking features have been sufficient for refactoring code and structural editing.
Most of the things I do with Emacs, people will claim are doable with tmux and a dozen assorted shell utilities, which IMO is just false since you don't get any user interface or convenience close to e.g. Magit, TRAMP, notmuch, eww, and so on.
For instance, one of the supposed benefits of Vim is that you can SSH into some box without your editor of choice installed and still edit files. Why settle for such an experience when TRAMP lets you stick with your current editor, with all of its configuration and plugins, to modify files remotely?
Likewise, notmuch is very good at organizing threads across several mailing lists, whereas Thunderbird has needed me to Google dozens of questions, click lots of buttons to set up filters, and still end up with nothing better than a flat list of e-mails in my inbox.
Since my post is getting long, I'll lastly mention eww is great for browsing things like Javadoc, Codox, the Common Lisp HyperSpec, and other statically-generated HTML documentation.
The most time investment I've needed into Emacs was essentially
sudo port install emacs-app
Then asking a friend to guide me through installation of a few packages. I don't use distributions like Doom or Spacemacs, and the last time I modified my init.el was apparently 2022-05-13. I got into Emacs in 2020.
I've tried time and time again to try using VS Code so that I can help some friends get into programming languages like Clojure, but I always find myself spending 30 minutes searching how to do things I take for granted in Emacs, like automatically indenting code as you type rather than manually hitting "Format Document" or creating a save hook running that function.
I have to ask: Did you just quit Vim the editor or also Vim the keybindings?
Because I can totally see not using the editor, but the keybindings, even the basic ones every emulation plugin manages to do well, are such a big win it's hard for me to believe I'd ever give those up.
Also, at least from my perspective, VSCode and Neovim are eye-to-eye, in the sense that both get most of their magic from interfacing with Language Servers, which both do very well. Though even the more approachable Neovim still seems to have a fetish for configuration/a hatred against sane defaults.
I spent hours using emacs as well, and then moved to VSCode. Everything's just so much faster in VSCode because 1) there are more plugins and 2) I can test and quickly learn plugins.
I don't disagree, but I think a followup question would be "how configurable do you really need your text editor to be?"
Don't get me wrong, I'm all for tinkering with stuff and customizing to oblivion, it's fun and cool, but realistically how much better of a JS engineer are you going to be if you've customized the hell out of Emacs? Maybe a bit more, I'll concede that, but fundamentally I don't think it's going to be categorical.
Interestingly, the features of Emacs, Vim and VSCode are largely orthogonal (on top of any basic text editor). Further, a lot of people advocate using CLI instead of most VSCode features. Waiting for the holy grail, you can still learn from each:
* Vim key bindings for text edits
* CLI commands for Git
* Emacs modes such as Magit and org-mode
* VSCode for tight Typescript integration, nice block structure visualisation etc.
I hope some editor will combine this all. (I think Emacs will be best placed to achieve this with its extension ecosystem that already includes Magit, org-mode, evil-mode etc.)
VS Code doesn't have anything comparable to Org mode. The "VS Code Org Mode" extension[1] has maybe 1% or 2% of the features of Org mode for Emacs. Go read Org mode's manual[2] from start to finish (134,062 words) and then direct us to an extension for VS Code that has comparable features (e.g., plain text spreadsheets).
I've tried org mode several times and I think it's both overkill and not something most people would benefit from considering the time investment needed.
> I spent a lot more than 20h learning emacs. It was my main editor for years.
Yeah, same. Are people saying a 20h commitment is a lot now? Not sure if I've gotten old, or if things are easier to learn now, but a 20h commitment seems very light.
(1) If I can spend 20 hours learning something boring or 20 minutes learning something else boring but they do the same thing I would prefer 20 minutes.
(2) There are more tools to learn. That 20 hours/minutes is being multiplied by an ever growing number.
Thing is, git is pretty much everywhere. I don't think I've had a single job in the last 15 years or so where the client or employer wasn't using git. 20h to learn a tool that will be used daily for decades (yes, I expect this trend to go on for at least another 10-30 years), and the choice is not mine to make, is not much.
I should say, if the choice was mine, I'd probably still choose git. It's powerful, it's something I can already make really good use of, and I feel comfortable easing new people into it. Regardless, the point is that I don't have that choice as an individual who's part of a team or an organisation.
On the other hand, I can choose my code editor without interfering with my fellow engineers's own choices. Here I chose the one I could learn in 20 minutes and get better the more I used it.
It took me very little time to get productive with VS Code, and then a few more days to get used to most shortcuts I use. Everything else is accessible via the command palette and more shortcuts can be learned as their functions become used more frequently. Until that, the palette is an excellent interface. It also serves as a discovery mechanism for features.
With emacs I had to learn a lot before I started getting productive. I got very good at it. Multiple cursors, window jockeying, buffer wrangling, the works. Any functionality that fell in disuse for some time, I'd risk forgetting their shortcuts. If I did, the only way to use it would be to somehow remember the shortcuts, probably by interrupting work and googling, or by trying to navigate its archaic menu systems.
With vim it's not very different. I still use it more or less daily but mostly for single file editing over SSH, since even if I need to do more complex editing on a remote server, I can use my local VS Code.
I don't think people have to use VS Code. I work daily with other engineers who use it, but also any of the IntelliJ editors, vim, emacs, you name it. We can all live in harmony and collaborate just fine. Not so much if each of us chose their own VCS.
> Yet most people just want to use VSCode and don't want to invest 20+ hours learning emacs or vim
To expand upon this, for version control there are pieces of GUI software that can allow you to handle the typical workflows without touching the CLI. Packages like Git Cola, GitKraken and functionality that's bundled with most IDEs as well.
In addition there are also platforms like GitHub, GitLab and Gitea which will let you easily get changes merged across branches with either a single commit or different strategies, all with the press of a button.
I don't think that there is a reason for everyone to do a deep dive into Git, aside from their own curiosity, or when using complex workflows.
I've been using Vim (well, neovim now) for a long time, it's my primary editor, and I think I'm reasonably good with it, but I really don't think I'm appreciably more productive than the average VS Code user.
The thing I like about Emacs is: there is always another 'prestige level' for you to strive for, because you will never master everything. You get better, and better, and better ... positive feedback, increasing ability to get stuff done.
When I teach students git I show them init, status, add, commit, diff, and log. That's all I focus on for several weeks.
That's enough to track changes to your own projects, see the benefit of tracking what you've changed, and build the habit of committing frequently. IME, adding anything more about remotes and branches is overwhelming to the point that they don't bother with git, because they aren't going to distinguish the fundamentals from those more complex features.
Obviously branches and remotes are vital in real development, you just can't expect them to learn it all from the start.
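That whole first-weeks sequence fits in a dozen lines (project and file names made up):

```shell
tmp=$(mktemp -d) && cd "$tmp"
git init -q my-project && cd my-project
git config user.email student@example.com; git config user.name Student

echo "print('hello')" > main.py
git status --short            # ?? main.py  -- untracked
git add main.py
git commit -qm "first version"

echo "print('world')" >> main.py
git diff --stat               # shows what changed since the last commit
git commit -qam "add world"
git log --oneline             # the project's history so far
```

No remotes, no branches: just the edit/diff/commit loop, which is the habit worth building first.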
Students have to understand the concepts first, then the commands.
Many of the commands don't make sense, and many of them are dumb implementations, which wouldn't be done that way in a clean redesign of git, even if it were based on the same storage model and concepts.
Some of the commands are essentially tooling that is external to the Git storage model and emulates delta-based version control on top of the native snapshot model. The commands lead to fundamental misunderstandings such as that commits are changes.
No, definitely not, but if you start using it you can fall in love with it easily and suddenly find merging overly complicated :) I dread the merge workflow my colleagues use in a strange multi-repo-with-subrepos project - I just rebase all branches in all repos, find it even simpler, stay happy, and can still see my history at least :D
Well that's actually "create a branch and then checkout that new branch". And you can use "git switch -c" if you feel the "checkout" verb is confusing.
The command to just create a branch is "git branch". Git branch won't touch the HEAD pointer, that's what switch does.
And if "git branch -c" were the command for "create branch and switch", it would be criticised in the exact same way: creating a branch and switching to it would be a different subcommand to that used for switching existing branches. Except now the branch command can also do "switchy" things to your HEAD pointer, but only sometimes (and the fact that checkout also only sometimes changed the HEAD is why switch was introduced).
What would be nicer is an interactive CLI interface for git. For example: "git switch <branch>", then if the branch doesn't exist, git asks you:
"Branch <branch> does not exist; create it? [y/n]"
To be fair, I've long used "checkout -b" and I don't think new branches need this, but it might be helpful in special situations like rebase/merge conflicts. I appreciate this sort of interactive CLI in other tools (apt, vite, etc.), versus having to look up the manual for the right options/switches. It's a different approach that might be more annoying to some power users, but is more friendly to most other users.
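The proposed interactive behaviour is easy to sketch as a tiny shell wrapper (the name "gsw" is made up; git itself has no such prompt):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
git commit -q --allow-empty -m base

# Hypothetical wrapper: switch if the branch exists, otherwise ask first.
gsw() {
  if git show-ref --verify --quiet "refs/heads/$1"; then
    git switch -q "$1"
  else
    printf 'Branch %s does not exist; create it? [y/n] ' "$1"
    read -r ans && [ "$ans" = y ] && git switch -qc "$1"
  fi
}

printf 'y\n' | gsw new-feature   # answers the prompt with "y"
```

Power users could alias around it; everyone else gets the apt-style confirmation instead of an error.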
(This is also where GUIs are helpful, since they display most of the useful commands and options for you. Plus they can do things like see line history by commit and list your stashes for super-easy switching; GitLens/GitLess in VSCode do these.)
I think git is fairly easy as far as data structures and such goes but the issue is that you need to understand it to use it well.
In the same way that you don't need to understand the inner workings of Google Docs, Git should be something you don't even have to learn. Sadly we're not there.
Yes, if I want to find someone to work with, I certainly want someone who does care about not making a total mess of the VC history. Selecting for that seems like a very good thing, regardless of whatever else they contribute.
Not an issue per se, but there are plenty of candidates who might not care about merging vs. rebasing but have an interest in other things. So many interviews consist of interviewers hoping you parrot their own worldview back to them. It’s not a good way to build a team.
I don’t care. Why? Because it’s largely a meaningless distinction that has nothing to do with the goal at hand. It’s like Emacs vs. vim or tabs vs. spaces.
Another way of thinking about interviewing is this: you say you don’t want to work with people who don’t care about this problem you care about. That’s fine, I guess. But if you changed your viewpoint to “I want to work with candidates who care about interesting things” then you might find you ask more about what they care about and less about what you care about.
I'm with you. I don't care. I prefer merging but if someone insists on rebase + squash commit I don't care because it doesn't make a difference.
Sometimes I'll tell a rebase person that if they don't like merge commits they can use git log --no-merges and you won't see merges. The majority of people I come across who prefer rebase for "clean history" reasons don't know this.
If you truly understand the difference between rebase and merge you'll come to the conclusion that it doesn't matter what you use.
It's super easy to avoid. You dont have to learn git's internal model to use it and it barely matters whether you use merge commits or rebase in 99% of code bases anyway.
You might as well judge people on how well they know xkcd comics or VIM keystrokes.
I don't get it. Why would you spend so much time on git? You can teach the core ideas in 2 hours, let them do some homework (4 hours?), and the rest they can figure out later. I don't see the value of understanding git in detail.
I've been using git for twenty years and I still don't intuitively know what half the operations do. When my local tree fucks up, it might as well need open heart surgery, because I can fix my repo as much as I can perform that.
I've only taught git to, like, two or three people. 20-30 hours is insane. Could it be that starting from the commands is not the best way?
Git is basically just blobs, trees, commits, and refs [0]. Four simple concepts. Yes the interface is confusing and I don't know it either, but things are easy enough to look up.
You taught people with some sort of existing knowledge or advantage. This is one of those classic pitfalls that comes with experience - we forget how hard things were early on and/or fail to realize when something was “easy” because we had natural talent or passion.
It’s probably best to consider that something like git is _not_ inherently interesting and is more likely (subjectively) boring and tedious to humans who are comfortable with simple saving and undoing of files.
The porcelain parts of git are terribly complex (though improving; especially if you just apply the config changes it suggests during normal operation).
However, the core data model is simple (it is a content addressable store, where each commit contains pointers, forming a directed graph; the pointers are unforgeable, so the graph is acyclic).
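All four object types can be poked at directly with plumbing commands; a sketch in a throwaway repo (the last line assumes the default loose-refs layout):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q -b main
git config user.email a@b; git config user.name a
echo hello > f; git add f; git commit -qm one

git cat-file -t HEAD              # commit
git cat-file -t 'HEAD^{tree}'     # tree
git cat-file -p 'HEAD^{tree}'     # lists the blob entry for f
blob=$(git rev-parse HEAD:f)
git cat-file -t "$blob"           # blob
git cat-file -p "$blob"           # hello
cat .git/refs/heads/main          # a branch is just a file holding a commit hash
```

Once you've seen that a commit is a tree plus parent pointers, and a ref is a 41-byte file, most of the porcelain becomes less mysterious.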
This meme needs to die. That is one tiny component of the data git touches when working with it.
There are many sources of interacting states, some internal to git, some not:
- the local tree of commits
- remote trees of commits
- a possible origin relationship between the two
- branch labels on each local and remote, with possible relationships between them
- the HEAD label
- the stash
- the index/staging area
- the actual working directory
- .gitignore
A simple question like which files are being tracked is actually a complex result of several different places in the above. Untracking a file is consequently extremely non-obvious.
Each commit in Git contains a file tree. So a single commit is what's needed to answer the question "what files does this commit track?", which is the only question about file tracking that makes sense to ask in Git.
Describing Git’s behaviour in terms of other version control systems’ semantics (e.g. Subversion or Mercurial) is not necessarily easy, but that’s not what’s being claimed.
That's incorrect. Git considers files tracked if they are in the last commit or staged. As far as I am aware it considers files untracked if they are not tracked and not ignored (though git clean -x documentation suggests that it considers such files untracked and simply ignores them when listing untracked files, which also makes sense). So un/tracked depends on last commit, staging area and possibly .gitignore.
> Remember that each file in your working directory can be in one of two states: tracked or untracked. Tracked files are files that were in the last snapshot, as well as any newly staged files; they can be unmodified, modified, or staged. In short, tracked files are files that Git knows about.
> Untracked files are everything else — any files in your working directory that were not in your last snapshot and are not in your staging area.
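Those rules are easy to verify with a scratch repo (a sketch; filenames are arbitrary):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.email you@example.com && git config user.name you
echo a > committed.txt && git add committed.txt && git commit -qm init
echo b > staged.txt && git add staged.txt         # tracked: in the staging area
echo c > loose.txt                                # untracked
echo d > build.log && echo '*.log' > .gitignore   # ignored

git ls-files                                # tracked: committed.txt, staged.txt
git ls-files --others --exclude-standard    # untracked: .gitignore, loose.txt (build.log is ignored)
```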
Any tips for someone outside the industry wanting to up their hobby programming game? Basically just trying to avoid the Useful_Script_v2-Final-Final-ReallyFinalThisTime.py syndrome, and realizing the value of a dated change history.
My suggestion would be to not learn git. Go with just about any other version control system if you’re not using it professionally. I’ve used cvs, Subversion (svn), Perforce (p4), and others. Honestly, Subversion and Perforce were both significantly easier to setup, learn, and use than git. While they all have their growing pains, it took me about a week to get used to Perforce. It took me more like a month or two to get used to svn. I’ve been using git for years and still find it very hard to use and strongly dislike interacting with it.
I second Subversion - though only because it is the closest to an open source alternative to Perforce. Otherwise it has a bunch of problems, like three separate half-baked implementations for shelving and aside from TortoiseSVN every other GUI front end has been either abandoned or on life support with a bunch of bugs.
But unless you have to work with other people in places like GitHub, it beats having to bother with git - especially for games that have a ton of binary files (which, unlike what some people will tell you, you want both version-controlled and in the same repository).
Hell, if you really want a DVCS, go with something like Fossil; it is still much easier than git, simpler to set up (just a single binary), and has more features (wiki, bug tracker, forum, etc.) that you will find useful anyway.
Though personally the best experience I had with a VCS is with Perforce, at least in gamedev: check out the latest version, merge any local changes, make modifications in a changelist, shelve the changelist in case I want to stop working on something and work on something else, use the shelved changelist to send a WiP version to a coworker to merge with his changes (or see if things work as expected) or for code review, etc.
Sadly Perforce seems to be bound to a company that tries to squeeze it for all it's worth, adding a bunch of stuff of questionable usefulness, etc. It'd be nice if there were an open source alternative that allowed the same or very similar workflows; all the issues I had with P4 over the years (e.g. merges between streams) were due to how P4 seems to be implemented, not to anything inherent in the workflows themselves. There is no reason for an alternative to copy all the bugs.
Hard disagree. Subversion is awfully complex compared to git.
Yes, Subversion is initially easier to learn and use than git. It's not easier to set up as it's client-server while git is fully local. Also Subversion is an incongruous mess.
Subversion's CLI is actually sane and much easier compared to the abomination provided by Git. Additionally, Subversion can be used entirely locally, without the need to deploy and configure any server application.
It seems that you are comparing apples to oranges. Building your own SVN server from the ground up can indeed require some effort. Doing the same for Git demands more or less the same level of effort on your part. So, I believe you are comparing building an SVN server from the ground up to something like installing Gitea or GitLab, or using Git locally.
Again, you don’t have to install an SVN server. Just run `svnadmin create REPONAME` and use the `svn` client to import your data into the repository.
You don't have to set up a database for Git, either, and it works entirely locally. Git init, edit or copy in some files, git add, git commit, boom you're done. Optionally add a remote, push to the remote, pull from the remote if needed. If you're working alone, as I do, this is about 95% of the Git I need. Occasionally I clone to a different machine, or use Working Copy on iOS.
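For reference, that entire solo workflow sketched end to end (the remote step is commented out, since the URL would be a placeholder):

```shell
cd "$(mktemp -d)" && git init -q project && cd project
echo 'print("hi")' > main.py
git add main.py
git -c user.email=you@example.com -c user.name=you commit -m 'initial commit'

# Optionally publish (hypothetical remote URL):
# git remote add origin git@example.com:me/project.git
# git push -u origin main
```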
So? You've just described exactly what's achievable with Subversion. The only missing part is adding remote repositories.
> You don't have to set up a database for Git, either, and it works entirely locally.
What database? Subversion doesn't need any special database to work. Just the repository and its working copy. Both can be local and can be created with two commands.
You're just talking past each other. You were responding initially to another user saying Subversion needs a server, and you responded that it doesn't. A different user responded, thinking your statement meant that you thought Git needed a server.
I disagree (with your advice, not your experience). I'm a total amateur and use Git for versioning prose. It is the only SCM that I can easily use across multiple devices and platforms. I don't use it for complex operations, mostly clone, commit, pull, push, branch now and then. I taught myself to use it from the command line. I guess being curious and persistent helped me get to whatever minimal level of utility I have with it.
If you want to go old school there is RCS or SCCS. GNU provides source code (its SCCS clone is called CSSC). Note, though, that RCS and SCCS are per-file, not per logical commit.
IIRC Perforce actually used RCS under the covers for storing the deltas.
The main complexities of git really come when you start having to work with other people also editing the same files you are. If you're just doing a solo project, you'll likely be mostly on a single branch, maybe a handful of others if you're experimenting a lot. In that case, you largely just need to know add, commit, push, checkout, and can largely ignore the complexities of merging and rebasing.
You only need to understand {init, commit, diff, status, push} to use git. When something goes wrong you can delete your .git and start new. This is obviously not what you want when you have a codebase with collaborators somewhere publicly accessible, but it's fine while you're getting used to it.
Telling someone to "delete your .git" is actually TERRIBLE advice. This should never be necessary unless you go screwing around in .git and break internals. It has a high probability of causing irreversible data loss. It's exactly the worst habit to build if you want to start collaborating.
If your repo ends up in a weird state, learn how to fix it. It should not be terribly complicated, especially if there is no rebasing happening.
That's only true if you don't want to undo any changes. What's the easiest way to get back the previous version of function foo() in file bar.py if I already made other commits to the same file, and want to keep those?
Keeping a "bar copy(2) final working.py" file is not about having a nice-looking timeline; it makes it very easy to get back to a working state. All you need is copy and paste, or to keep the working function as a comment in bar.py. You can see your working code and know you're not messing with it while you experiment with bar.py. I'm not saying that's the best way of doing it, but it's a very common use case that's not usually addressed in git guides, where it's all about pushing commit after commit, maybe branching, maybe pushing to a remote with others, but rarely enough focus on undoing mistakes after some experimenting if you didn't branch first.
> What's the easiest way to get back the previous version of function foo() in file bar.py if I already made other commits to the same file, and want to keep those?
You are operating under the assumption that foo() changes in a vacuum. That is an invalid assumption for lots of software changes.
Commits are supposed to contain one logical change each. That logical change may or may not cross function boundaries (often it will).
Reverting a commit should revert the logical change. That is how you accomplish what you're after in git.
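A sketch of how that plays out, assuming the change to foo() was its own commit (toy repo, made-up contents):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.email you@example.com && git config user.name you
printf 'def foo():\n    return 1\n\ndef baz():\n    pass\n' > bar.py
git add bar.py && git commit -qm 'add foo and baz'
printf 'def foo():\n    return 2\n\ndef baz():\n    pass\n' > bar.py
git commit -qam 'change foo'        # the change we now regret
printf 'def foo():\n    return 2\n\ndef baz():\n    return 3\n' > bar.py
git commit -qam 'improve baz'       # a later change we want to keep

git revert --no-edit HEAD~1         # undo only 'change foo'
cat bar.py                          # foo() is back to return 1; baz() keeps return 3
```

If the regretted change shares a commit with changes you want to keep, `git revert` will conflict and you end up cherry-picking hunks by hand, which is why one logical change per commit matters.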
I've seen so many people do the "select function body, copy, toggle comment, paste copy of function, try out ideas on this new version" workflow. I've done it plenty too in Matlab and similar during various labs at school. I'm not talking about best practice during software engineering, I'm talking about having the confidence your code can easily get back to a known good state if you want to experiment, like beginners should do a lot of.
And plenty of times functions are self contained enough that you can change the function body without changing other code as well.
The light bulb moment for me learning git, which I don't see get mentioned nearly enough, was this:
Stop thinking in terms of branches, and start thinking only in terms of commits. Pretty much every git operation makes so much more sense when you consider your repo just as a tree of commits (technically a DAG), rather than trying to reason about what it does to the branches. Branches are really just pointers to commits which can change over time, and most git commands don't really care about them.
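You can see this directly: with git's default loose-refs storage, a branch is just a tiny file containing a commit hash (a sketch):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m root
branch=$(git symbolic-ref --short HEAD)   # main or master, depending on your config
cat ".git/refs/heads/$branch"             # the "branch": one commit hash
git rev-parse HEAD                        # the same hash
git branch experiment                     # creating a branch = writing one more pointer file
cat .git/refs/heads/experiment
```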
I think you can immediately get value out of Git, even if you understand almost nothing about it. Also, there are a lot fewer footguns if you're not collaborating with anyone else.
Check out the book Learn Git in a Month of Lunches. It's exactly as advertised—only thing to keep in mind is that it was written before the master > main changeover, so depending on your local config you may need to swap that out when you're following along with its exercises.
I taught myself Git in University. Over the course of a decade, I've taught Git to dozens of people. Sure, some people take more time than others but 20-30 hours sounds absolutely wild to me!
The goal here is to save my files, change things, and get back to my old version if that ended up being a bad idea. That's a simple request. 20 to 30 hours of learning for a big undo is bullshit.
Honestly, “easy” is relative. Understanding software engineering is a four-year course of study, so having someone take a 20-40 hour course on git is half a week to a week of proper studying, after which you'd know git. I don't get why people will do bootcamps for _only_ JavaScript and CSS that take at least 8 weeks of hard work, yet won't spend half a week of that effort to learn git, and call it “not easy”. It's the limited depth we're willing to commit to understanding something that leads to people saying it's not easy. Invest proper time into it and everyone (in software engineering) can learn git.
I’m not sure it would be much shorter with some other version control program. Merges, rebasing and bisect are fundamental to what version control does.
This seems to be the one to try. Gitless appears to be abandoned.
And I like this idea of having an alternate model of interaction with git. It seems to fit well within the git design of having common plumbing and being able to swap out the porcelain.
Yes, agreed. This looks like it is being actively developed. It has some interesting ideas. I like the fact that conflicts are committed, and can be resolved later:
"""If an operation results in conflicts, information about those conflicts will be recorded in the commit(s). The operation will succeed. You can then resolve the conflicts later. One consequence of this design is that there's no need to continue interrupted operations."""
Every single time I see one of these attempts at simplifying Git, I'm reminded of how bad source control is when you're forced to use it in a terminal.
Entering commands in a terminal means you always have to hold a mental view of the state you're operating on; better affordances that were equally powerful would make a world of difference to beginners.
No wonder Git is like C++, where many argue a subset of it would be best to use, but nobody agrees on the exact subset.
I used to be opposed to git GUIs, but after a few years and working in larger codebases, I love the way they visualize everything - especially when dealing with hundreds of branches. Sure, I’ll use the git CLI from time to time, but GitKraken works really well for me. I feel pretty safe seeing my stashes in front of me, etc.
This is cool, because Git is almost too popular: there are too many poorly written tutorials, and I have seen so many people get into trouble while Stack Overflow just gives them more commands that get them into an even stickier situation. Yes, if you learn all its ins and outs you will understand it, but that shouldn't be necessary for a tool that is now so widely used.
Git has improved a lot, like erroring out if you try certain operations with unsaved changes, but even this has issues, e.g. if you have a file which is untracked in one branch and committed or gitignored in another.
90% of my git problems I’ve solved with two aliases: “git publish” and “git synced”
Publish pushes to the remote (doing the right thing to safely update it: force push with lease).
Synced updates the local branch with the trunkiest-trunk (upstream preferred over origin, using whichever one of main or master exists). It rebases against that remote branch.
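The commenter's actual definitions aren't shown; one plausible way to define aliases like these (the bodies here are guesses and simplifications - in particular, this "synced" doesn't do the upstream-vs-origin and main-vs-master detection described above):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
# Hypothetical alias bodies; the real ones may differ.
git config alias.publish 'push --force-with-lease'
git config alias.synced '!git fetch origin && git rebase origin/HEAD'
git config alias.publish    # prints the alias body back
```

Anything starting with `!` runs as a shell command, which is how an alias can chain multiple git operations.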
Shameless plug: I've tried to build an alternative CLI for Git [1].
I think it's just too hard to create a new abstraction. Git seems too widespread and many people won't switch if they already have muscle memory of the commands.
Perhaps we haven't found the correct abstraction yet.
I think tools like these are fantastic and I do not think that git "is easy" or anything like that. Lots of people do not need to understand the details of the git model and should not have to use git directly.
That said - if you are a senior on a project using git flow or something like it - you should treat understanding git as a concern of the first order. The commands are somewhat obtuse, but the tool is a pretty reasonable attempt to grapple with the underlying challenges of managing disparate work from different people going at different speeds. It is challenging to understand because it is fundamentally quite difficult. Don't underrate the importance of this to your overall process! Learning git will help you understand the challenges of distributed development (and vice versa) - they are real and not simply an artifact of bad UX.
I'm the git guy on the team and even I get a bit nervous when having to resolve conflicts or figure stuff out in our branching strategy. 95% of the time I am fine and so is the team, but it's when things involve many teams and many commits on the same subset of files that things get hairy.
I will say this: I find git much better than Microsoft's Team Foundation Server (I know TFS can be set up to use git now); overall I like git a lot more.
Does anyone think any of the competitors to Git will take off? I hesitate to move to something else because inertia is a thing: all my stuff is on github, the team and all my colleagues at work know it, guides exist for it online, but then again ... if I could master a better tool I could teach everyone... hmm.
After this many years (and previously using cvs, svn etc) .. I really don't want to switch again.. git is great IMO. There is no way that learning to use git is more difficult than say, calculus or basic algorithms.
Most CS professors at my school do not know how to use git. The department insists that professors use git in the classroom, which is probably a good idea given the ubiquity of git. My professors are forced to try and teach it to students without really understanding it themselves, so it's a struggle for everyone.
Perhaps my professors don't know git because they got their PhD before git was invented (2005), and they never needed to learn it.
Always thought there was a good use case for something like this, but as an embeddable library. If you have a software project that shouldn’t require a technical understanding of version control from its users, but should be able to handle version naming and collaboration, you might want something that abstracts away some of the messier parts of the git workflow and touches the file system less (no merge conflict markers injected into files, for example). Bonus points if it just works over git so users could host it wherever.
The gitless vs git section is lacking. I can see the differences, but you're not making any claims that it's better. I don't really know from this page why I should use gitless.
That’s not the impression I had. It was immediately clear that switching branches in Gitless did what Git always should have done and that was enough for me
That tracking concept is worse than Git's staging area. It expects you to know what to track before making your changes which is almost never the case. Maybe you can track after the fact; then, it just becomes an alias for `git add`. Why confuse people with this? How is this simpler?
"You can stop tracking changes to a tracked file with the gl untrack command"
Does that mean if I stop tracking, the rest of my changes will be ignored, or that file will be excluded from the upcoming commit? What happens if I track a file for a brief period of time, but edit it before and after tracking? What would I be committing then? Am I expected to understand how this works in order to keep a decent commit history?
"You can always revert a file back to some previous version with the gl checkout command."
So, use a command that's entirely different ("checkout") than the intention ("reverting"). Sounds like Git to me. How is this better?
Git CLI is bad, yes, this seems to be no improvement whatsoever.
I would have loved for gitless to become a mainstream way of using git. Unfortunately it did not... Maybe one could implement the gitless workflow in VSCode to help it take off? I still think their basic analysis and the reduction in state that the resulting system has is spot on (and I believe it is also aligned well with what works in Mercurial/Sapling).
Start using worktrees. E.g., I have one for development, one for reviewing other people's code, and one for looking at another commit (usually main, using `checkout --detach`). You could easily have more.
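A minimal sketch of that setup with `git worktree` (directory names are arbitrary):

```shell
base=$(mktemp -d)
git init -q "$base/repo" && cd "$base/repo"
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m root

git worktree add -q "$base/review" -b review   # second checkout on its own branch, for reviews
git worktree add -q --detach "$base/snapshot"  # detached checkout, like `checkout --detach`
git worktree list                              # one repository, three working trees
```

All worktrees share one object store, so switching tasks never requires stashing: each task keeps its own working directory.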
Another git-compatible, simpler VCS is "got" (about which I don't know enough to compare well). It's targeted to OpenBSD devs but ported to various *nix platforms.
When I was teaching computing at high school a few years ago, I was constantly perplexed that there was no 'simple' VCS built into beginner IDEs/editors/online IDEs. As a starting point, nothing more than being able to save a version (number, label, description) and revert to an earlier one.
Looking at how a lot of students learn by trial and error, being able to be a little more verbose on what they're trying/have tried would make this trial and error so much easier, and also be more descriptive/conscious of what they're doing.
Having effectively gotten rid of the save button with auto-save, why didn't we replace it with a 'tag' button?
I’m not sure how long it’s been included, but a non-technical colleague of mine noticed this week that VS Code now has built-in “history” for files in a project. Highly limited compared to a VCS, but very useful for basic change history and reversion.
That's been there for a while, also in PyCharm/IntelliJ suite, but for the purposes of teaching and learning, I've been looking for something that's more deliberate - where you actively say what you've done/are doing. History is nice, but you end up with a lot of it, and it's kind of useless to go back through to find that one specific edit that worked better than the others but now everything's broken.
I like the suggestion of not having to worry about stashing changes when going back and forth between branches, this is definitely a point of friction for me with git. Not sure I’d be ready to learn a whole new workflow just for that though
I don’t know git besides pull and push (never needed it, not an employed SWE). Should I learn it or should I learn shinier and supposedly easier alternatives like jujutsu? I only code for myself.
A BIG PROBLEM with git is that it works in terms of editor lines, not language constructs. It should instead track changes in terms of language constructs - for example, tracking the history of a class individually, an interface, etc.; changes could be logged automatically. For this to work it would, of course, need to be adapted for each language.
Furthermore, it could work in architectural terms, such as tracking the files of an MVC component as an individual unit. The architecture for a file tree could be defined and enforced in a git file, and it could spot missing pieces.
Also, it needs dependency management for these units.
I can see two ways to implement something like this:
1. Completely alter the paradigm for editing software: from editing text to editing the AST (or a projection of it). See Unison [0]. This gives you fine-grained understanding of every aspect of the development flow, but requires rewriting most of the tooling that developers are already using.
2. Layer the smart features on top of the existing text streams. This gives you less control and is more resource intensive, but you don't throw away the entire developer ecosystem.
If we choose #1, we waste millions of man-hours building these tools for every new language.
If we choose to layer our language-specific features on top of text, then why does this need to happen at the Git level instead of at a higher level of abstraction? IntelliJ, for example, can already use the git history to tell you who wrote a given method and when. This kind of functionality could, in principle, be extended to following the history of a method when it's moved or renamed.
Is there any benefit gained by having the VCS itself do the language-aware processing instead of having language-aware tooling layered on top of the VCS?
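Git already ships a small piece of this layered approach: `git log -L` can follow one function's history through the text, using per-language funcname heuristics. A sketch with a toy repo:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.email you@example.com && git config user.name you
echo '*.py diff=python' > .gitattributes    # use git's built-in Python funcname rules
printf 'def foo():\n    return 1\n\ndef baz():\n    pass\n' > bar.py
git add . && git commit -qm 'add foo and baz'
printf 'def foo():\n    return 2\n\ndef baz():\n    pass\n' > bar.py
git commit -qam 'change foo'
printf 'def foo():\n    return 2\n\ndef baz():\n    return 3\n' > bar.py
git commit -qam 'change baz'

# Shows only the commits that touched foo(), skipping 'change baz':
git log --oneline -L ':foo:bar.py'
```

This is exactly language-aware tooling layered on top of plain text diffs, rather than baked into the storage model.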
> It should instead track changes in terms of language constructs.
This does not generalize very well. On top of Git's core data structures and file management facilities you would also need parser plugins for every language, plus additional commands to manage language constructs. This would make it even more difficult to grasp.
I would say version control of text files in general is not an easy problem. Git's complexity arises from the complexity of the problem it solves. Perhaps the user interfaces of Subversion or Mercurial are easier, but I don't think they lift from beginners the burden of understanding version control and why it is necessary.
That is A problem, but the biggest problem IMO is that it doesn't make it easy to perform the tree operations. Everything has a special syntax, the reflog isn't ergonomic to use if you screw up, the staging area is a bit of unnecessary cruft for most workflows, and the branches-as-pointers model doesn't match how most businesses use a VCS.
Once you understand a DAG and the concept of a hunk, it's easy to say what I want to do, but using the command line tool makes it a pain in the ass.
It also doesn't help that the rest of the user interface is a bunch of complicated behaviors tacked onto marginally related commands.
The underlying model is also weird. You would expect it to be diffs, but instead it's full files. Fortunately you can ignore that.
Git: unnecessarily difficult since it was invented.
It could be something that just sits outside of git and uses the raw git history to figure out what changed in a semantic way. It doesn't have to be part of git.
This sounds like something C# programmers would dream of having or making.
In general this seems like it was done by somebody with not more than a very basic understanding of git...?
This looks like you will get a lot of "Merge main into main".
Learn git properly. It took me a few weeks in my first job, and it was done. I don't understand how people can sit here and learn a million intricate rules of CSS, Rust lifetimes, or whatever else, and refuse to use a tool that works very, very well once learned. It's not easy, it's not fun, but it's IMO the smart way to do it. You don't unlearn git; you learn it once, and you use it.
What threads like these show is not that git is hard, but that collaborative software implementation is still an experimental and challenging endeavor.
In the absence of a tyrannical dictator, it may not even be possible.
I've used git for the last 7 years and I've had no problems with it. If a developer is not willing to invest their time into learning basic git commands and concepts, then I think they're in the wrong business.
I'm sorry but someone who is unable to determine problems with a tool they've used for 7 years is a huge red flag. That sort of blindness to frustrations is legitimately a massive problem in this field.
But I never get frustrated with git. ¯\_(ツ)_/¯ I've encountered people who use features like rebase because they like to keep history "tidy" (I assume?); I think it introduces unnecessary complexity, and IMO history should be kept as-is - it's history, after all.
Have you tried Mercurial? The Git user interface is a mess in comparison, since it evolved from a collection of loosely related scripts rather than being cohesively designed. Git does the same things and works fine once you learn it, but there's a lot of unnecessary inconsistencies to learn.
Learning how to handle common edge cases is incredibly frustrating and challenging. It's rarely obvious what's wrong and how you're expected to fix a problem. Even worse, the stakes feel incredibly high. If you do something wrong, you risk losing your work. Or, even worse, messing up someone else's work.
It does admittedly have some nice syntactic sugar for things that are annoying in git but come up often, like the easy include/exclude of files and being able to switch branches without having to stash or resolve merge conflicts.
[1] https://ohmygit.org/