
My personal, subjective impression: commits are getting smaller and smaller nowadays. In the Subversion days, many people committed only a few times a day, and sometimes not for several days. SVN commits of course involved a sync with the server (a "push" in git lingo), and thus usually represented a much larger increment with a substantial change to the code base. [X]

With git, it became very common to structure changes to a code base in many, very small commits. Rename a variable? Commit. Write some docs? Commit. Of course, the overall changes when developing a feature did not become smaller; they are now just distributed over many more commits. So I'd argue that an SVN commit was often conceptually closer to what we now have with a git pull request.

Why does this matter? Because it is kind of hard, and helps no one, to describe the renaming of a local variable with an extensive, docstring-like message.

What I do miss, however, is a good description of the overall change. Often the description in the merge commit is just the autogenerated message, but this is where I would like people to really take the time and describe the change extensively. This is why I like `--squash` merges: they let people focus on the relevant parts in their description. I know, rewriting history is bad, but overall I prefer reading a history book to reading 18th-century newspapers.

[X] I'm not saying that there weren't small one-line-change commits, but overall they were rarer.
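For readers who haven't used it, the `--squash` workflow mentioned above might look roughly like this. It is only a sketch in a throwaway repository: the branch name `feature`, the file `a.txt`, and all commit messages are made up for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > a.txt
git add a.txt
git commit -q -m "Initial commit"

# Many tiny commits on a feature branch.
git checkout -q -b feature
echo one >> a.txt
git commit -q -am "Rename a variable"
echo two >> a.txt
git commit -q -am "Write some docs"

# Squash them into the index on main, then write one proper description.
git checkout -q main
git merge -q --squash feature
git commit -q -m "Add feature X" \
  -m "This is where the extensive description of the overall change would go."

# main now holds the initial commit plus one squashed commit.
main_count=$(git rev-list --count main)
echo "commits on main: $main_count"
```

Note that the small commits still exist on `feature` until that branch is deleted; the squash only decides what ends up on `main`.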




> [the merge commit message] is where I would like people to really take the time and describe the change extensively

http://lkml.iu.edu/hypermail/linux/kernel/1702.2/03492.html


Never thought of that usage of merge commits. This is a great place to write the couple of paragraphs that you might have in a Pull Request, better than squashing IMO.
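A minimal sketch of that idea, assuming a hypothetical `feature` branch and made-up messages: `--no-ff` forces a real merge commit even when a fast-forward would be possible, and `-m` supplies the description instead of the autogenerated one.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo one >> a.txt
git commit -q -am "Small step 1"
echo two >> a.txt
git commit -q -am "Small step 2"

git checkout -q main
# One -m with a multi-line string: subject, blank line, then the body.
git merge -q --no-ff -m "Merge feature: add X

Why the change was needed, what it does, and anything a reviewer
would otherwise have to dig out of the pull request." feature

merge_subject=$(git log -1 --format=%s)
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "$merge_subject ($parent_count parents)"
```

The small commits stay in the history, and the merge commit carries the overall description.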

I've found that for smaller commits, if you have something long you want to explain in the commit message body... you should probably put it in a code comment!

If you don't think it merits a code comment, it's probably not important enough for people to look up the commit message body either (if only because the commit message body is less likely to be seen).


> What I do miss however, is a good description of the overall change.

https://github.com/ribasushi/dbix-class/commit/1cf609901

Something like that I take it? :)


Wow. That took some commitment!


There's actually a paragraph at the end about that too :)


excellent


Changing _public_ history is bad. I don't see any problem with rewriting your _personal_ history before merging it in.


Changing public history is bad, because it makes collaboration and two devs working on one branch harder.

But I do not see a problem with rewriting history on a branch, if (and only if) you kind of know that no one else is pulling the changes. Or, when merging a PR, a rewrite is okay too, if the next feature will be branched off of the trunk, too.

Also, Mercurial's tooling (https://www.mercurial-scm.org/wiki/ChangesetEvolution) seems to help with rewritten history by making it easier to track history rewrites. Basically I think this is a path in version control systems worth exploring.


Not only not a problem, but a must in my book, and I'm fairly sure I'm not alone. For me it's a new workflow that I always wanted but never could have without git. A lot of my days now consist of creating many small commits and then, every couple of hours when a single 'thing' is finished, starting an interactive rebase to create a storyline which is easy to read, understand and follow. Sometimes this can even be a single commit, if that makes sense. And in repos I manage myself, if the change spans several days it's usually big, so I might create a separate branch and have a merge commit so it's extra clear that all commits belong to feature/xxx.
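The cleanup step described above is normally done by hand in the rebase editor. As a scriptable approximation (not necessarily what the commenter does), fixup commits plus `--autosquash` give a similar result; file names and messages here are invented, and `GIT_SEQUENCE_EDITOR=true` simply accepts the generated todo list without opening an editor.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "feature" > b.txt
git add b.txt
git commit -q -m "Add feature file"
echo "docs" > c.txt
git add c.txt
git commit -q -m "Add docs"

# A later correction that really belongs to "Add feature file" (HEAD~1).
echo "feature, fixed" > b.txt
git commit -q -a --fixup=HEAD~1

# Fold the fixup into its target while replaying the branch onto main.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

feature_count=$(git rev-list --count main..feature)
echo "commits on feature after cleanup: $feature_count"
```

Three messy commits become a two-commit storyline, and nothing outside the local branch was rewritten.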


I find tons of small commits to be clutter and a waste of time. I don't see any reason for doing so. On the contrary, I can see a disadvantage: reading and understanding the history later may become a difficult task. After all, what counts is your full chunk of work, reviewed via pull request, and merged to master. It should be treated as a whole.

Has it really become so common with git? I don't see such a trend around me.


> On the contrary I can see a disadvantage - reading and understanding a history later may become a difficult task.

I'm replying to you but this is directed at everybody who advocates squash merge and discourages small commits.

IMO this is a tooling problem, plain and simple. When I am committing to Git, I am using the "write" components of Git which are incredibly powerful. I can commit in as small a chunk as I want and preserve the richest history of all the small changes I've made, knowing full well that the state of the code at HEAD will not be degraded for doing so. If I make two small independent changes, I can feel free to branch them separately and then merge them together to show that they could have been performed in any order.

When you read my history, you are using the "read" components of Git. Unfortunately these are not as powerful. You can do some nice things: if you want to treat history as a straight line, you can use `git log --first-parent` and you'll see only the commits on the first-parent chain, which on a merge-only master means just the merge commits (as if all merges had been squash-rebases).
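To make the `--first-parent` point concrete, here is a small sketch in a scratch repository (branch and file names are invented). It builds a merge with `--no-ff` and compares the full log with the first-parent view.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo one >> a.txt
git commit -q -am "Rename a variable"
echo two >> a.txt
git commit -q -am "Write some docs"

git checkout -q main
git merge -q --no-ff -m "Merge feature: add X" feature

echo "full history:"
git --no-pager log --oneline
echo "first-parent view:"
git --no-pager log --oneline --first-parent

all_count=$(git rev-list --count HEAD)
first_parent_count=$(git rev-list --count --first-parent HEAD)
echo "$all_count commits in total, $first_parent_count on the first-parent chain"
```

The verbose history is still there for anyone who wants it; the first-parent view just skips the commits that came in through the merge.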

It would be much better if you were able to collapse or expand any sequence of linear commits to gloss over the lower level details. But as far as I'm concerned, this is a problem with the "read" components of Git, not the "write" components, and so I will continue to use the "write" components to their full power. And the best part is that if I do it this way, we can improve the "read" components and allow the reader to collapse my verbose history, but we will never be able to expand pre-collapsed history.


There is a "Collapse Linear Branches" action in IntelliJ's git log viewer (and I guess in any JetBrains IDE) which does pretty much what you describe :-)


The main reason I request commits to be split up is for ease of code review. It's much easier to review three commits that each do one easily comprehensible small thing than one commit that does three things at once. It's also better if you find there's a bug -- you can bisect down to a commit that's fairly small where the bug should be easy to see, rather than one that's enormous and where the bug is hard to find among all the other changes.
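A rough illustration of the bisect argument, using a toy repository where the "bug" is just a marker line and the "test" is a `grep` (both are stand-ins for a real defect and a real test suite):

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Six small commits; the fourth one introduces the marker "BUG".
for i in 1 2 3 4 5 6; do
  if [ "$i" -eq 4 ]; then
    echo "BUG" >> code.txt
  else
    echo "line $i" >> code.txt
  fi
  git add code.txt
  git commit -q -m "Change $i"
done

# Known-good commit: the root. Known-bad commit: HEAD.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The command must exit 0 for a good commit and 1 for a bad one.
git bisect run sh -c '! grep -q BUG code.txt' >/dev/null

first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With small commits, the commit that bisect lands on is also a small diff to read, which is the whole point being made above.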


I think it is a matter of how you define "small" and "enormous". If you have a small, easily comprehensible thing that is still big enough to be a complete piece of work, then you probably also have a separate task for it, and the change you introduce doesn't break the build. So in the end it's just a perfect candidate for a pull request.

But note that the comment above mentioned a commit for a variable rename. Or a commit for adding a sentence of documentation. Nano commits they are.

Sure, tasks should be small, easy to get, easy to review. But there must be a balance. Going to extremes, in either direction, doesn't do any good.


Indeed, if the commits are individually reviewable it is nicer. On the other hand, these small commits can often be a bit messy. Sometimes you'll find commits that are reverted later on, or fixed up later on. I.e. for commit-level review to work well, the history needs to be polished.


Small, incremental commits are an asset with git blame, git bisect and git revert. I find it much easier to deal with too many small ones, rather than too few large ones. Especially if you keep the convention that master is always "merged into", i.e. "left of the merge", i.e. "parent 1".
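As a sketch of why the "parent 1" convention is handy (names are invented): when master is always the branch being merged into, undoing a whole feature is a single `git revert -m 1` on the merge commit, where `-m 1` tells git to treat parent 1 as the mainline to return to.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo one >> a.txt
git commit -q -am "Small step 1"
echo two >> a.txt
git commit -q -am "Small step 2"

# main is "merged into", so it becomes parent 1 of the merge commit.
git checkout -q main
git merge -q --no-ff -m "Merge feature: add X" feature

# Undo the whole feature relative to parent 1.
git revert --no-edit -m 1 HEAD >/dev/null

content=$(cat a.txt)
echo "a.txt after revert: $content"
```

A single small commit on the feature branch could be reverted on its own just as easily, without `-m`.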


I find very small commits, especially, to be tedious and error prone. Sometimes the software doesn't even build because the developer distributed two not-so-independent changes over two commits, the connection between them not being obvious. Then you have a failed build and you don't really know whether `git bisect` just beamed you into the middle of a refactoring or whether there is an actual issue.


> After all what counts is your full chunk of work, reviewed via pull request, and merged to master. It should be treated as a whole.

I find the PR mechanism works great for the view of the whole, whereas the individual commits are great for the pieces. So in my commit history, you can read the timeline, and then if you want to see the commits squashed down, you click on the individual PR. On the PR screen (assuming you're using GitHub), it has a nice list of the subject lines of each of the individual commits.


Commits can serve as a supplement to documentation. When you properly commit the different logical steps that led to the current state of the code, it becomes much easier for another team member to understand why and how you implemented things a certain way.


Would be interesting if there was a way to annotate a set of commits, like "commit ???? - ????: refactored A,B, and C" so you'd get the advantage of small commits and clearer messages.


This is what PRs are good for. Also, with my particular approach to commits, I always have at least one issue associated with a commit, and I'm always working on a particular branch associated with the issue. I pick an emoji that captures the issue/branch in a single concept, and I put that in my subject line. This is combined with my git commit template mechanism, and I like it. At a glance, I can see which commits belong together, and if I want to look at the whole, I go to the PR.

E.g. https://github.com/ibgib/ibgib/pull/180


neat


I think you can do that in a merge commit, sort of.

The more I think about it, the stranger a strong aversion to rewriting commit history for clarity is. In university if I did some math / physics calculation, I would often start, and once I got somewhere, make a clean copy of the successful work to have a concise and revised version.


I am a firm believer that it's totally fine to rewrite history when working on a private branch that hasn't been pushed.


Mostly fine to do it on a feature/PR branch also, in my opinion. If those become long-lived with multiple people touching them (where history rewrites become problematic), you are not integrating continuously enough.


Not pushing private branches is risky though - you have no backup if something happens to your machine.


Unfortunately, I'm guilty of the opposite: I rarely, rarely commit. Maybe one commit per point. I have to consciously remind myself to commit more often.


> As in: In the subversion days, many people committed only a few times a day, sometimes not for several days.

This was often the source of merge hell. Half of what makes git merges easier is the smaller commits that it encourages.


But it was also partly the tooling. Most svn projects I worked on were trunk-based and thus integrated much more tightly than git feature-branch-based code. However, the times I merged Subversion branches, I was fairly sure that Subversion lost some changes.


I mean, aren't pull requests basically the solution to that problem?


Not if the merge commits just say `merged branch ....`.



