I really do not see the problem with the Rails commit history. Having those merge messages that point back to a pull request leaves lots of documentation about what was done and why it was done. I really do not know why people are so particular about their git log either. It shows an accurate history of the repo, not a revised, cleaned-up history.
That said, being able to edit pull requests before merging was a good thing to learn. However, I think in the long run I would want to learn it with plain old git rather than tack another tool onto my workflow.
So to be clear, every commit in the ActiveMerchant history also points back to the initiating PR via the commit message, including the PR number (which gets linked up). So there's no real loss of information in that sense.
And I hear you re: adding another tool, as I'm very cautious about that myself. That said, `git am` is "git canon", and all hub is really adding is the ability for it to slurp GitHub URLs as well as Maildirs.
Neat, related trick: add ".patch" to the end of any PR URL on GitHub and you'll see a formatted patch for the PR. All hub is really doing is slurping that and passing it to git as though it came from a mailing list.
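For example, something like this works with plain git (the URL here is made up, and this assumes you want the PR applied as commits on your current branch):

    # fetch the formatted patch for a PR and apply it with plain git
    curl -sL https://github.com/someuser/somerepo/pull/123.patch | git am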
> So there's no real loss of information in that sense
It is in reality. Try to git bisect a bug. (Yes, I read the part about bisect in the post. But thanks, I'd rather search for the bug across several 10-line patches than in one 1000+ line patch.)
Most of the contributions I'm merging in are 10 lines, but a lot of them (half?) take two or more (sometimes many more) commits to converge on those 10 lines, and none of those intermediate commits has useful information in it. The other half of contributions are only a single commit already.
If someone contributes a true 1000+ line contribution that I'm even willing to take (yikes!) then you better believe I want that in multiple, logical commits, and it may even be me that ends up doing the splitting.
I'm sure a lot of it just has to do with what types of contributions a given OSS project receives, so YMMV.
MMDV: on my current project we have code reviews, and nothing passes review and merges unless the automated CI (builds, tests, lints, etc.) passes, so if your new feature takes 1000 lines of code, so be it. We do try to keep to the minimum necessary change, though, and focus on tight conceptual chunks. But a patch should never break the tests, especially not just to keep it below a certain size; otherwise, how do you do automated bisects?
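(By "automated bisects" I mean something along these lines, assuming the test suite runs via `make test`; the commit names are placeholders:)

    git bisect start <bad-commit> <good-commit>
    git bisect run make test   # runs the tests at each step; a non-zero exit marks the commit bad
    git bisect reset           # return to where you started once the culprit is found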
His answer there on this topic boils down to "tell the patch author that they made a mistake, and steadfastly refuse to do anything with it yourself until the patch author fixes it". That is the whole point of this article: that approach is simply not pragmatic. You can end up with changes that aren't quite perfect, where the original contributor is not willing to make the changes you require. When a change is abandoned in an imperfect state, Linus's answer is effectively to ignore it.
Linus' situation is that his ability to handle incoming contributions is a real bottleneck. In that case it is worthwhile to say, "Make my life easy or I ignore you." And because people want their stuff in his project, they do wind up making his life easy.
Most open source maintainers are not in that position.
Yeah, I agree Linus takes things too far at times. My response was mostly because of the author's "this is the way Linus would have done it" stuff. I do agree that we should be willing to do a little cleanup ourselves when feasible. But I disagree with the author that we should edit /somebody else's commits/.
Well, quoting from that message, "If it's not ready, you send patches around". I think the big difference is that I'm willing to take patches that aren't ready and get them ready, whereas I'm sure Linus usually pushes back on the contributor to clean up their stuff. And I do think merges are super useful in a lot of cases, but Linus added `git am` too, and I like using all the tools at my disposal when they fit.
Yeah, I feel the same way. I've never been one to squash my commits, even when I commit something stupid and have to revert it. We have three owners and one collaborator on a JavaScript project right now; we've been good about keeping up with pull requests and closing off bugs. I can see where a bigger project like ActiveMerchant could get nuts if only one or two people are handling it.
There's an issue here where people are conflating two definitions of "history". Sure, what happened in the real world can't be changed, but why should we be constrained by that in the worlds we construct in software? Just because someone calls a record of development "history" doesn't make it inviolate, and quite frankly, as a maintainer, I don't care about every little sneeze that a developer had on a project. I want each commit to compile and pass tests at a minimum, so that I can do automated git bisects. I would prefer to have a coherent history of well thought out conceptual chunks, so that I can figure out how things are supposed to be, not what some sleep-deprived programmer was thinking at 3AM on a Sunday.
> I want each commit to compile and pass tests at a minimum, so that I can do automated git bisects.
What stops you from doing automated git bisects when some commits shouldn't be tested? Returning 125 from your script if the commit doesn't compile will skip that commit. On GitHub you can skip any commit whose message doesn't start with "Merge pull request #", for example.
If you're squashing commits together and only leaving chunks that compile & pass the tests, what are you losing by skipping commits that don't compile & pass the tests?
Can you come up with a concrete example of where this would be a problem? I might be missing something, but I can't picture a case where there'd be an issue.
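As a sketch of the kind of bisect script described above (assuming `make` builds the project and `make test` runs the tests; the script name is just an example):

    #!/bin/sh
    # check.sh -- run with: git bisect run ./check.sh
    make || exit 125    # commit doesn't build: tell bisect to skip it
    make test           # exit 0 = good, non-zero = bad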
I guess different people have different workflows. I prefer to make many tiny atomic commits, which gives me a very granular ability to backtrack if things go wrong.
For "conceptual chunks", I use branches or tags, and don't worry about the individual commits very much.
Same here. I don't generally care how "messy" the git commit log is, but when we reach a point where something is notable, then it's time for a tag/release or whatever.
There's an argument that bisect is only useful if every commit builds. If, as is often the case with my history, most commits don't compile, maybe there's an argument for squashing?
Would you not just skip those? Skipping non-compiling code is the canonical example of "git bisect skip".
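Interactively that's just something like (placeholder commits):

    git bisect start <bad-commit> <good-commit>
    # ...the commit bisect checks out fails to build...
    git bisect skip    # don't test this one; bisect picks a nearby commit instead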
You could also have a little script that does something like:
    # Usage: gbisect_prs <bad> <good>
    git bisect start "$1" "$2"
    for sha in $(git rev-list "$2".."$1"); do
        # skip anything that isn't a PR merge commit
        git log -1 --format=%s "$sha" | grep -q '^Merge pull request #' || git bisect skip "$sha"
    done
It just tells git, before you start testing, to skip any commit that's not a PR merge.
You could also just build this into your bisect script if you're not doing it by hand: return 125 if it's not a commit you want to test.
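Something like this, roughly (`make test` standing in for whatever check you're bisecting on; the script name is made up):

    #!/bin/sh
    # pr-only.sh -- run with: git bisect run ./pr-only.sh
    git log -1 --format=%s HEAD | grep -q '^Merge pull request #' || exit 125
    make test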
This works as long as there are only a few commits which don't build. It fails miserably if, say, 25% of all commits don't build (don't ask). In such a situation it becomes increasingly hard to tell what actually broke whatever functionality you're interested in: the commit you end up with, or the N prior commits which don't build (and which are incidentally usually full of "noise").
> the commit you end up with, or the N prior commits which don't build (and which are incidentally usually full of "noise")
The alternative, where things are squashed together, would leave you with just one commit but it'd be those N combined together (at least, it could well be more).
You can still diff between last good and first known bad. If necessary you can branch and squash the non builders to see what happened. At worst the intermediate commits gives you the same problem you would have if you had squashed to start with.
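Roughly (with placeholder SHAs and a throwaway branch name):

    # see everything that changed between the last good and the first bad commit
    git diff GOOD_SHA BAD_SHA
    # or collapse the non-building commits onto a throwaway branch to inspect them as one change
    git checkout -b inspect BAD_SHA
    git reset --soft GOOD_SHA
    git diff --cached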
> At worst the intermediate commits gives you the same problem you would have if you had squashed to start with.
Well, if you're going to be bisecting a lot or compilation takes a while, it saves you time to do the squash now (when you know which commits are supposed to compile and which aren't) rather than when you come back to it.
It might make sense to squash so you only have compile-ready commits in your log.
In a sense I work this way, without the squashing. When I write a test, the build fails and nothing has been done on the application codebase so I don't commit. When I fix the build by writing new code, I commit.
Maybe I should be doing small intermediate commits and squashing the commits into one commit.
> Maybe I should be doing small intermediate commits and squashing the commits into one commit.
I would recommend trying that, or at least something similar: try committing regularly (this is useful if you ever want to go back to a prior state while working, or e.g. pick up where you left off on a different machine), then reorganize and clean up your topic branch (via rebase -i, etc.) into a few commits before merging (fast forward). What is "a few commits"? Well, it's a bit of a judgement call, but I just group stuff into logically coherent units which make sense as single units of functionality. Of course you must make sure each individual commit at least builds sensibly, and preferably that all tests run too.
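Concretely, that workflow looks something like this (branch names are just examples):

    git checkout -b topic
    # ...commit early and often while working...
    git commit -am "wip: first pass"
    git commit -am "wip: handle edge case"
    # before merging, reorganize those commits into coherent units
    git rebase -i master
    # then fast-forward merge
    git checkout master
    git merge --ff-only topic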
I think a lot of it depends on context. In a private repo I'm right there with you. But when I look at a public repo with a lot of contributors, I want to know what the "logical" history is a lot more than I want to know about the three steps it took to get a feature ready for public consumption.
And one of the major benefits of git is that you can do short, small commits often, to your private repo. When your fix or feature add is ready, you can squash them into a more coherent commit that builds and passes tests (so that it doesn't break bisect) and push it to the public repo.
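One way to do that squash, assuming the work lives on a local fix-widget branch (names are illustrative):

    git checkout master
    git merge --squash fix-widget   # stage the whole branch as one change
    git commit                      # single commit that builds and passes the tests
    git push origin master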
I don't understand the bisect argument; maybe I'm missing something, but can't you just skip everything that doesn't build and pass the tests? Returning 125 from your script automatically skips the commit.
And how do you know that wasn't the commit you were looking for? The whole point of bisect is that you don't have to do a manual bug hunt; you can create a script that tests for the thing you are looking for. If it's something static sure, that's easy. But if it's something more complicated and you don't know what's causing it (a prime candidate for bisect), you will need to build and analyze the behavior of the built executable. Non-building commit = more manual bughunting.
But aren't we talking about squashing those commits? You're not going to lose anything by skipping commits between PRs instead of squashing everything between PRs into one commit; you can only narrow it down more.
> you can create a script that tests for the thing you are looking for.
Yes, which you can tell to skip anything that doesn't compile/pass tests. Return 125 from your script and it'll be skipped.
> Non-building commit = more manual bughunting.
But you can just do a loop of "make || git bisect skip" and skip everything that doesn't compile, surely. Or if you're using a script, just return 125 for things you want to skip.