Merge Pull Request Considered Harmful (spreedly.com)
271 points by watson on June 26, 2014 | 111 comments



I really do not see the problem with the Rails commit history. Having those merge messages that point back to a pull request leaves lots of documentation about what was done and why. I really do not know why people are so particular about their git log either. It shows an accurate history of the repo, not a revised, cleaned-up history.

However, being able to edit pull requests before merging was a good thing to learn. That said, in the long run I would want to learn it with plain old git rather than tack another tool onto my workflow.


So to be clear, every commit in the ActiveMerchant history also points back at the initiating PR via the commit message including the PR number (which gets linked up). So there's no real loss of information in that sense.

And I hear you re: adding another tool, as I'm very cautious about that myself. That said, `git am` is "git canon", and all hub is really adding is the ability for it to slurp Github urls as well as Maildirs.

Neat, related trick: add ".patch" to the end of any PR url on Github and you'll see a formatted patch for the PR. All hub is really doing is slurping that and passing it to git as though it came from a mailing list.
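
For example (repo and PR number here are made up), you can slurp that patch yourself and pipe it straight into `git am`:

    curl -L https://github.com/user/repo/pull/123.patch | git am -3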


> So there's no real loss of information in that sense

There is in reality. Try to git bisect a bug. (Yes, I read the part about bisect in the post. But thanks, I'd rather search for the bug in several 10-line patches than in one 1000+ line patch.)


Most of the contributions I'm merging in are 10 lines, but a lot of them (half?) take two or more (sometimes many more) commits to converge on those 10 lines, and none of those intermediate commits has useful information in it. The other half of contributions are only a single commit already.

If someone contributes a true 1000+ line contribution that I'm even willing to take (yikes!) then you better believe I want that in multiple, logical commits, and it may even be me that ends up doing the splitting.

I'm sure a lot of it just has to do with what types of contributions a given OSS project receives, so YMMV.


MMDV, and for my current project we have code reviews that don't pass and merge unless the automated CI (builds, tests, lints, etc) pass, so if your new feature takes 1000 lines of code, so be it. We do try to keep to the minimum necessary change, though, and focus on tight conceptual chunks. But a patch should never break the tests, especially not to keep it below a certain size, otherwise how do you do automated bisects?


> otherwise how do you do automated bisects?

Return a 125 from your script if the tests fail.
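
A minimal `git bisect run` script along those lines (the build and test commands are placeholders):

    #!/bin/sh
    # exit code 125 tells bisect to skip commits it can't judge
    make || exit 125        # doesn't build: skip this commit
    ./run_tests.sh          # exit 0 = good, any other non-125 code = bad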


Since the author of the article decided to invoke Linus, I thought I'd see what Linus thought about the merge vs rebase debate.

http://www.mail-archive.com/dri-devel@lists.sourceforge.net/...

Turns out Linus also agrees with drunken_thor, that merge messages are useful. Suppose that's why Linus added merges in the first place. ;)


His answer there on this topic boils down to "tell the patch author that they made a mistake, and steadfastly refuse to do anything with it yourself until the patch author fixes it". This is the whole point of the article: that is simply not pragmatic. You can have changes that aren't quite perfect, where the original contributor is not willing to make the changes you require. If it's abandoned in an imperfect state, Linus's answer is effectively to ignore it.


Linus' situation is that his ability to handle incoming contributions is a real bottleneck. In that case it is worthwhile to say, "Make my life easy or I ignore you." And because people want their stuff in his project, they do wind up making his life easy.

Most open source maintainers are not in that position.


Yeah, I agree Linus takes things too far at times. My response was mostly because of the author's "this is the way Linus would have done it" stuff. I do agree that we should be willing to do a little cleanup ourselves when feasible. But I disagree with the author that we should edit /somebody else's commits/.


Well, quoting from that message, "If it's not ready, you send patches around". I think the big difference is that I'm willing to take patches that aren't ready and get them ready, whereas I'm sure Linus usually pushes back on the contributor to clean up their stuff. And I do think merges are super useful in a lot of cases, but Linus added `git am` too, and I like using all the tools at my disposal when they fit.


Yeah, I feel the same way. I've never been one to squash my commits, even when I commit something stupid and have to revert it. We have three owners and one collaborator on a JavaScript project right now; we've been good at keeping up with pull requests and closing off bugs. I can see where a bigger project like ActiveMerchant could get nuts if only one or two people are handling it.


Yeah, I found the notion of a commit being "history worthy" kind of silly. If that's how it happened, then it's history! It's not a value judgement.


There's an issue here where people are conflating two definitions of "history". Sure, what happened in the real world can't be changed, but why should we be constrained by that in the worlds we construct in software? Just because someone calls a record of development "history" doesn't make it inviolate, and quite frankly, as a maintainer, I don't care about every little sneeze that a developer had on a project. I want each commit to compile and pass tests at a minimum, so that I can do automated git bisects. I would prefer to have a coherent history of well thought out conceptual chunks, so that I can figure out how things are supposed to be, not what some sleep-deprived programmer was thinking at 3AM on a Sunday.


> I want each commit to compile and pass tests at a minimum, so that I can do automated git bisects.

What stops you from doing automated git bisects when some commits shouldn't be tested? Returning a 125 from your script if it doesn't compile will skip that commit. When it comes to GH you can skip any commit where the message doesn't start with "Merge pull request #" for example.
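
For instance, a sketch of a bisect run script that only judges PR merges (build/test commands are placeholders):

    #!/bin/sh
    # skip anything that isn't a PR merge commit
    git log -1 --format=%s | grep -q '^Merge pull request #' || exit 125
    make || exit 125
    ./run_tests.sh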


And how do you know that wasn't the commit where the bug was introduced? You check it by hand? Great, you've now defeated the purpose of git bisect.


If you're squashing commits together and only leaving chunks that compile & pass the tests, what are you losing by skipping commits that don't compile & pass the tests?

Can you come up with a concrete example of where this would be a problem? I might be missing something, but I can't picture a case where there'd be an issue.


I guess different people have different workflows. I prefer to make many tiny atomic commits that give me a very granular ability to backtrack if things go wrong.

For "conceptual chunks", I use branches or tags, and don't worry about the individual commits very much.


Same here. I don't generally care how "messy" the git commit log is, but when we reach a point where something is notable, then it's time for a tag/release or whatever.


There's an argument that bisect is only useful if every commit builds. If, as is often the case with my history, most commits don't compile, maybe there's an argument for squashing?


Would you not just skip those? Skipping non-compiling code is the canonical example of "git bisect skip".

You could also have a little script that does something like:

    # Usage: gbisect_prs <bad> <good>
    git bisect start "$1" "$2"
    for commit in $(git rev-list "$2".."$1"); do
        git log -1 --format=%s "$commit" | grep -q '^Merge pull request #' \
            || git bisect skip "$commit"
    done
That tells git to skip any commit that's not a PR merge before you start testing.

You could also just include this into your bisect script if you're not doing it by hand, return 125 if it's not a commit you want to test.


You can always `git bisect skip` when you hit a commit that doesn't build or you can't identify whether the bug of interest is present.


This works as long as there are only a few commits which don't build. It fails miserably if, say, 25% of all commits don't build (don't ask). In such a situation it becomes increasingly hard to tell what actually broke whatever functionality you're interested in: the commit you end up with, or the N prior commits which don't build (and which are incidentally usually full of "noise").


Why does it make it harder?

> The commit you end up with, or the N prior commits which don't build (and which are incidentally usually full of "noise").

The alternative, where things are squashed together, would leave you with just one commit but it'd be those N combined together (at least, it could well be more).


You can still diff between the last good and the first known bad. If necessary you can branch and squash the non-builders to see what happened. At worst the intermediate commits give you the same problem you would have had if you had squashed to start with.


> At worst the intermediate commits give you the same problem you would have had if you had squashed to start with.

Well, if you're going to be bisecting a lot or compilation takes a while, it saves you time to do the squash now (when you know which commits are supposed to compile and which aren't) rather than when you come back to it.


It might make sense to squash so you only have compile-ready commits in your log.

In a sense I work this way, without the squashing. When I write a test, the build fails and nothing has been done on the application codebase so I don't commit. When I fix the build by writing new code, I commit.

Maybe I should be doing small intermediate commits and squashing the commits into one commit.


> Maybe I should be doing small intermediate commits and squashing the commits into one commit.

I would recommend trying that, or at least something similar: try committing regularly (this is useful if you ever want to go back to a prior state while working, or e.g. pick up where you left off on a different machine), but reorganize and clean up your topic branch (via rebase -i, etc.) into a few commits before merging (fast forward). What is "a few commits"? Well, it's a bit of a judgement call, but I just group stuff into logically coherent units which make sense as single units of functionality. Of course you must make sure each individual commit at least builds sensibly, and preferably that all tests run too.
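
As a sketch (branch names are just examples):

    # commit early and often on a topic branch...
    git checkout -b topic
    # ...then, before merging, clean it up into logical units:
    git rebase -i master        # squash/fixup/reword as needed
    git checkout master
    git merge --ff-only topic   # fast forward, no merge commit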


Yeah, I've been using 'have I added things, and do my tests pass' as a metric for when I should commit - I'm pretty strictly TDD these days.


I think a lot of it depends on context. In a private repo I'm right there with you. But when I look at a public repo with a lot of contributors, I want to know what the "logical" history is a lot more than I want to know about the three steps it took to get a feature ready for public consumption.


One of the points of Git was short, small commits, often.

It helps prevent an SVN/CVS-style workflow, which would negate many of the other benefits of Git IMHO.


And one of the major benefits of git is that you can do short, small commits often, to your private repo. When your fix or feature add is ready, you can squash them into a more coherent commit that builds and passes tests (so that it doesn't break bisect) and push it to the public repo.


I don't understand the bisect argument, maybe I'm missing something but can't you just skip everything that doesn't build and pass the tests? Returning a 125 from your script automatically skips the commit.


And how do you know that wasn't the commit you were looking for? The whole point of bisect is that you don't have to do a manual bug hunt; you can create a script that tests for the thing you are looking for. If it's something static, sure, that's easy. But if it's something more complicated and you don't know what's causing it (a prime candidate for bisect), you will need to build and analyze the behavior of the built executable. Non-building commit = more manual bughunting.


But aren't we talking about squashing those commits? You're not going to lose anything by skipping commits between PRs instead of squashing everything between PRs into one commit; you can only narrow it down more.

> you can create a script that tests for the thing you are looking for.

Yes, which you can tell to skip anything that doesn't compile/pass tests. Return 125 from your script and it'll be skipped.

> Non-building commit = more manual bughunting.

But you can just do a loop of "make || git bisect skip" and skip everything that doesn't compile, surely. Or if you're using a script, just return 125 for things you want to skip.


Is this exactly the situation that https://help.github.com/articles/checking-out-pull-requests-... documents?

So basically:

    git fetch origin pull/ID/head:BRANCHNAME
    git checkout BRANCHNAME
And now you can edit that pull request and/or make a new pull request based on the previous one. Or just merge it into master.


It's possible to configure it to have all PRs automatically fetched when you use `git fetch origin`:

https://gist.github.com/piscisaureus/3342247

and then you'll have each PR available under something like

  git checkout pr/123
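
If I recall the gist correctly, it amounts to adding an extra fetch refspec to the remote in .git/config, something like:

    [remote "origin"]
        url = git@github.com:user/repo.git
        fetch = +refs/heads/*:refs/remotes/origin/*
        fetch = +refs/pull/*/head:refs/remotes/origin/pr/*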


Cool, I'll have to try that! Does it also pull PR's that live on feature branches in forks of the root repo?


It fetches all the PRs. Once you open a PR for a project (even from a fork), GitHub creates a ref for that PR in the upstream project.

Note though that the fetch will take some time the first time after you set this up - it will fetch all the historical PRs.


This also eliminates the use of the "Merge pull request" button, which gets rid of all the pull-request merge commits.

I don't know about you guys, but I have not found them useful? Do you use the merge commits for anything?


I always use "--no-ff" when merging topic branches to master. The advantage is that you can keep individual commits on the topic branch readable and still get an overview of the changes between two releases with "git log --first-parent".

You can also see the set of changes in a topic merge easily with "git log ${merge}^1..${merge}^2", whereas if you use a fast-forward merge it's not at all obvious which sets of commits are related.
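
A small sketch of that setup ("topic" and ${merge} are placeholders):

    git merge --no-ff topic            # always record a merge commit
    git log --first-parent --oneline   # overview: one line per topic merge
    git log ${merge}^1..${merge}^2     # the commits a given merge brought in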


We use the merge commits to trace back a given deploy or machine image to the pull request that caused it. So from a given deployed AMI, a Jenkins job can point to the pull request, and hence the full context of the feature or bug description, the back-and-forth of code review, etc.


The problem is that pull requests are one-way, read-only collaborations. If I want to make changes, I have to close the contributor's PR and open my own. With write access, I could do:

    git fetch origin pull/ID/head:BRANCHNAME
    git checkout BRANCHNAME
    # make changes
    git push origin pull/ID/head


You can always send a PR to the original sender's PR branch.


OP author here...

The result is the same, but I've found the `git am` workflow to be much smoother vs. mucking around with remotes. Some of that could be due to the nature of the OSS project I work on - ActiveMerchant - since just in the last month we've had 30+ unique contributors, and most of them have contributed a single change.

My general recommendation is to make sure you try out the `git am` flow for a bit, but then just do what works best for you.
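
For reference, the core of that flow looks roughly like this (hypothetical PR URL; assumes hub is aliased as git):

    git checkout master
    git am -3 https://github.com/Shopify/active_merchant/pull/1234
    # ...tweak, squash, or amend as needed...
    git push origin master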


Keep in mind that `git am` is not identical to pulling the commits. You might be applying the commits on a different base, and certainly your committer field will be different (not to mention any substantive changes you make). As a result, the sha1 of the commits you create will be different than those of the original submitter.

This makes life harder for the submitter, because they cannot ask a simple question: were my commits merged into the upstream repository? Because no, they weren't; but commits that are the moral equivalent were. Usually `git log --cherry-pick` can correlate the two, but not always.

We do use `git am` upstream when working on git itself, for the reasons you indicate in the article (plus we like mailing-list based review). But it does come at a cost in managing the various versions of patches.


This is why it is a good idea for proposed patches to be sent to a mailing list, where other people other than the repo owner can more easily review patches, and then to have those patches tracked via patchwork.

And when I accept a patch, even if I've had to edit the commit summary so it is parsable English (not all of my contributors speak English as a first language), I send a reply via e-mail saying "Thanks, applied".


Agreed, editing contributions does add a bit of extra work for contributors post-inclusion, especially if they've been using their own branch in production. That said, since this is a PR flow, the contributor will get an email from Github that their PR has been closed, and they can track directly from that PR to the closing commit.

While I've often wished that Github had a proper mailing list per repository that could be used to discuss changes, I have to say that one awesome thing about PR's is how nice they make reviewing a patch. The diff view is just awesome, so now that I have a `git am` style flow with them it would be hard for me to switch to just passing patches around on a mailing list. That said, a mailing list would still be a welcome adjunct to the PR flow if done right.


Well, the commits may be changed during pulling if the maintainer deems it necessary (or a merge just ain't happening cleanly), so it seems like the solution would be to just include the SHAs that were merged in the `git am` commits. Then the submitter knows that what they intended to get pulled got in, which should resolve patch verification, right?


I think maybe you overlooked the grandparent's suggestion; you don't have to mess with adding contributors' remotes. Open pull request heads are available on your existing remote.


Hm... this seems like a very complicated way of saying that github should have a way to merge pull requests into a new branch.

...but it doesn't so you have to:

    - checkout a local copy
    - add a remote to the PR
    - checkout a new branch
    - merge the PR into your local branch
    - fix code, merge to master
Which is entirely true; it is annoying.

The simple solution, though, is to require pull requests to come in on a feature branch, and flat out reject any that target master. /shrug


You can create a new branch containing the pull request fairly easily.[1]

It's just:

    git fetch origin pull/ID/head:BRANCHNAME
    git checkout BRANCHNAME

Edit: I see that MaikuMori[2] posted the same information just before me; ah well.

[1]: https://help.github.com/articles/checking-out-pull-requests-...

[2]: https://news.ycombinator.com/item?id=7949107


I appreciate it! I had no idea that this was possible.


>The simple solution, though, is to require pull requests to come in a feature branch, and flat out reject any that target master. /shrug

Agreed, and in my experience, every major open source project I'm familiar with requires pull requests to feature branches. In fact, most small projects use the same workflow.

Is this not the case with most open source projects?


...and I believe the key word here is 'major'.

There are a lot of small, one-off, often very useful utilities that people are now sharing with each other via Github (I'm guilty/a participant in this phenomenon), and many noble users of these utilities want to help, contribute, and send PRs... a non-negligible number of them new to Git.

So, is it more of a PITA to set up a feature branch for a single Python script and instruct users in your CONTRIBUTING file to 'make sure they submit PRs to branch XYZ!', or to just deal with the odd occasional PR to master? Folks new to Git will probably just send a PR to master anyway. (I believe OP addresses the 'new user' issue as well, having to explain Git commands to users in comments on a PR.)

That all being said, I still typically follow the workflow shadowmint outlines above.


Using git remotes and distinct branches per contribution is totally legit, and I still do it sometimes. But, I'm often/usually dealing with contributions that have already been reviewed and don't need a whole feature branch/review within the root repository before inclusion. And for that setup - where it's almost ready - it's so much easier to just `git am` it into master, make necessary tweaks, and push.

In general I'd just encourage maintainers to try the `git am` flow, especially on small to medium complexity contributions where it looks mostly ready to go and you don't want to do another week of ping-pong with the contributor just to get a variable renamed or some whitespace fixed.

As I tell my kids, "Just try one bite of <food I find delicious>. If you don't like it, that's cool, then it's more for me!" :-)


What about submitting a PR with your cleanup to the original PR's branch? That person reviews and accepts, then submits a new PR back to you.

I haven't learned git yet so maybe my assumption that the original PR is on a branch/repo clone to which you can submit a PR is incorrect?


You totally could do that, but it doesn't really help with the lack of interest problem he described. If the contributor can't be arsed to fix their patch, can we expect them to be arsed to merge a patch to their patch and then re-PR it?


Since PRs on Github auto-update when the branch they are based on is updated, they would not need to create a new pull request:

Bob requests that Alice merge bob/bob-feature into alice/master.

Alice requests that Bob merge alice/fix-bob-to-conform-to-pep8 into bob/bob-feature.

Bob merges that into bob/bob-feature, and Bob's PR into Alice's repo is now auto-updated to reflect the changes that he merged into his branch.

Alice is now satisfied, and merges bob/bob-feature with alice/master.
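
In commands, Bob's side of it is roughly (assumes Bob has Alice's repo as a remote named "alice"):

    git checkout bob-feature
    git pull alice fix-bob-to-conform-to-pep8   # merge Alice's fixup branch
    git push origin bob-feature                 # the open PR updates itself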


Good point; thanks for the info! Under the hypothesis that Bob is lazy, however, he still might not perform that merge...


Agreed. Having a pull request go to a feature branch (or bug branch, or whatever your pull request is aiming to do) is the best way to go about it.

The issue described by the author is definitely more of a workflow problem than a technology or service problem.


Torvalds himself doesn't like GitHub's pull request workflow for several reasons and doesn't accept pull requests on GH for the Linux kernel, see [1].

[1]: https://github.com/torvalds/linux/pull/17#issuecomment-56546...


He seems to be complaining primarily about the quality of commit messages, missing emails, commit sign-offs and things like that. It is more about bureaucracy (which is arguably important on a project of Linux's size) than about the workflow itself.


It's about the patches not being fit for merging. It might be because of the commit sign-offs, or it might be because it was missing error checking, or any number of things. The real question is which incurs more overhead --- asking the contributor to fix it, and then needing to wonder when/if the contributor will fix it, or just making the d*mned changes yourself, while preserving bisectability.

Another very common one isn't about bureaucracy, but about making the commit messages readable --- including the stack trace of the failure you are fixing (which again is important for non-toy projects that are being distributed in other products, and where people may cherry-pick fixes into a stable branch used for release purposes). Or if you have contributors from around the world whose native language is not English, and you want to make it easier for other people to understand what the heck is going on without having to read the contents of each and every diff.

This is fundamental to project health, which means that a proper workflow is fundamental to project health. Given that the vast majority of GitHub repos are toy-sized, or end up being abandoned, maybe that's fine for GitHub. But for any project where I have hopes that it will turn into something real (and if I don't have that hope, why would I waste time on it?), I'm not going to accept pull requests from GitHub.

It is just not going to happen.


That's cuz Linus Torvalds doesn't have much of a bedside manner. I agree with ya'll on the quality commit message stuff, but that last third was all conjecture from your part, homey.


The last third is based on my extensive experience having worked on open source software for my entire career, including being the ext4 subsystem maintainer and the e2fsprogs author and maintainer.


pwnd


These quality issues become a problem sooner than I would have thought, and Linux is big enough that they can just demand contributors know their stuff. But I found that in between "hobby project of my own" and "huge project where people beg me to contribute", there's a big space of "big enough that it gets contributions, but small enough that it's not worth turning contributors away for small quality issues". And I love that hub + `git am` lets me take those contributions but still maintain the workflow I want on the project.


You know, when I started reading this I thought that I would disagree with you and write a grumpy comment to that effect. ("Considered Harmful" usually makes me grumpy.) However, you convinced me otherwise. Congratulations on a good "Considered Harmful" piece!

Here's another example: I recently made a PR to a project to fix a broken URL. I changed the wrong file and forgot about the PR. The maintainer had to close the PR, make the change himself, and explain why/what he did. This whole process would've been a lot simpler if Github allowed people to merge to another branch.


> This whole process would've been a lot simpler if Github allowed people to merge to another branch.

Is this a problem that comes from forking? I raise PRs on my projects onto various different branches all the time. Or maybe I've misunderstood the issue.


I just tried this on an open pull request we had. Pretty OK experience. For some reason, though, there was a merge conflict when running the "hub am -3 <url>" command. In my case it was easy to fix, but Github reported the PR to be mergeable, so maybe Github uses a different merge strategy for PRs than "hub am" does out of the box?


Yeah, I don't get this hub thing. When I want to manually merge on a Github project I just follow the GH instructions on the PR page to manually checkout the fork, do my work there, and merge it locally then push. Why do you need another tool to do this?


As far as I know hub is more than just a command line tool for pull requests. It extends git with a lot of Github nice-ness. But in the end some people just like CLIs better - and in this case it exposes some features that aren't directly available in the UI.


FWIW hub is even handy for the "add a remote for the fork, check out the fork branch" flow, since it lets you just `git remote add ntalbott` instead of having to find or derive the full url for the remote.
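
i.e., something like this (repo name is illustrative):

    # with hub, this:
    git remote add ntalbott
    # expands to roughly:
    git remote add ntalbott git://github.com/ntalbott/active_merchant.git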


hub is just a bunch of shortcuts for operating on github repositories, rather than do it all by hand operation by operation.


As I understand it `git am` is actually replaying the series of commits on top of the branch you run it in, whereas a merge commit flattens the commits out and applies them in one go (while keeping the full history of the commits being merged). So it's definitely a different strategy, and you can get conflicts where you wouldn't with a merge commit. As you said, though, they're super easy to fix when they happen. Often when I see a conflict like this it's because there are 52 (OK, exaggerating, a little) commits in the contribution, and using the `git am` flow is a huge win since I'm now in a good spot to squash them down to one or two commits.


Just a note to people who want to try this: according to the post you should use the command "git am -3 <url>"; this seems to be a typo and should be "hub am -3 <url>".


The author didn't mention it, but in the hub docs they instruct you to `alias git=hub`. So when the author is running git, he's really running hub.


Ah, good point. I've been running hub for some time (love "hub pull-request"), but yes I never did set it up as an alias for some reason


I'm not sure how I feel about someone writing something and making me the author. I guess attribution makes sense in the form of paraphrasing. But I tend to think of commits as literal quotations.


I see what you mean, and it almost seems like the "sin" isn't the additional editing, it's the subsequent rebase. It's one thing to require contributors to rebase, but it seems like another to rebase for them. Maybe I'm missing something, but based on the "History Is Written By the Victors" section of TFA I would have been surprised to see anyone other than @ntalbott at the top of https://github.com/Shopify/active_merchant/graphs/contributo...


You may think of commits as literal quotations of the "Author", but it's important to realize that git doesn't. If you want to know who's ultimately responsible for a commit, you always want to look at the "Committer" since that's who actually signed on the dotted line so to speak.


I'm not concerned with who's ultimately responsible for the commit. I'm concerned with being called the author on something I didn't write. These distinctions are not made or enforced by git. At the end of the day, it's how humans interpret two pieces of metadata.

The first hit on Google and Bing for "git committer vs author" brings up this SO entry:

http://stackoverflow.com/questions/18750808/difference-betwe...

So, I don't think I'm taking some fringe view here. For my part, I don't think I want someone else writing code and putting my name on it.



I've always just pulled in the contributor's branch, made additional commits to fix things up, and squashed all the commits before pushing up to GitHub. Not very difficult.
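
A sketch of that flow (PR number and branch names are examples):

    git fetch origin pull/123/head:contrib   # grab the PR into a local branch
    git checkout contrib
    # ...make fixup commits...
    git rebase -i master                     # squash everything into one commit
    git checkout master
    git merge --ff-only contrib
    git push origin master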


Interesting problem for committers. Since Github made the hub tool, maybe they could make this person's flow better right from the web UI somehow.


Maybe, but... probably not. The thing is, I want to work with the changes locally before they get included. So it can't be a 100% web workflow. That said, there might be ways the web UI could make the transition to the CLI easier, and it would be great if Github promoted the `git am` flow on the PR pages as well.


What works best for me is to use a combination of the Pull Request button and to manually merge or rebase changes when necessary.

I used to strictly stay away from the Pull Request button, because it made my history "messy". Now I care less about that and more about the convenience (when it's appropriate).


I wish github would either detect when a branch is merged back into master and automatically close the PR, or allow for some more complex merge operation. Squash merge would be particularly helpful.


You should be able to update your fork to be the same as upstream, with one click on your fork.


"And as a hypothetical story-telling construct I created in my mind, Jane agrees 100%!"

That's a very strange way to end an article. Could have just stopped before adding that last line.


Using the term 'considered harmful' in a blog post title considered harmful.


I love the approach but hub seems to be Mac only :(


It's just a Ruby script, and it works on Linux/Mac/Windows.


It doesn't build properly on Linux?


"If the contribution is a single commit, I’ll do a git commit --amend and just smoosh my changes right into the original commit."

That is right, editing history is the way to go... or is it? :)


Considered Harmful blog posts Considered Harmful


You forgot to reference the great recursive essay Eric Meyer wrote on exactly this topic:

http://meyerweb.com/eric/comment/chech.html

Which essay I already knew about and chose to willfully ignore when I titled the OP. We could chat over beers sometime as to whether I'm a bad person for doing so :-)


We used to be elitist about our choice of language. We've evolved to become elitist about how we use source control.


harmful?


It's a play on the classic Dijkstra letter, "Goto Considered Harmful." Also yes, I think it's harmful. A co-worker and I wanted to use a new up-and-coming FOSS project that's hosted on GitHub, but we needed a certain killer feature. Said co-worker worked for about 4 weekends in a row and fully implemented it, with nice abstraction and separation of concerns. The code was then turned down because it was "too separated, and unnecessarily abstracted", which we disagreed with. Thus the pull request, with an amazing feature, sits languishing, because neither he nor I is going to spend another few weekends working on a project that won't integrate new features unless they're perfect and conform to the original author's idea of "perfect code".


So what you have is a fork, with a fully implemented feature, that didn't make it back upstream.

This happens all the time, and is the nature of open source. No project is obligated to take your contribution, and that happens for a myriad of reasons that you are not obligated to understand.

It's fundamentally a human problem, not a software one. There's nothing wrong with the software here—it does everything required and more to be able to manage the codebase.

Take your fork, explain the situation in the README and let it be. It'll be in the fork graph and the list on Github. People can find it. If they like your feature, maybe they'll use yours instead. If enough people get behind it and ask for it, or say "hey this was in a PR, why wasn't it merged?", the pressure could be enough for the original author to just accept the PR.

In any case, this is not a problem with the pull request feature. It's a simple collaboration problem, and a PR is only one method of communication you can use to solve it. If you give up after throwing a PR into the void (not that I'm saying you did), you shouldn't expect instant success.


Translation: "My co-worker wrote a clusterfuck of indirection spaghetti and the open-source volunteer refused to maintain it!"

Every PR that adds a feature adds future support and maintenance effort. If you and your buddy are unwilling to spend the time to get it right, why are you expecting the maintainer to spend the time to support it?


In short, it was a perfect implementation of adding support for dynamically updating scripts via NuGet for a C# port of Hubot, by typing "<robotName> update-scripts". This is the standard way of managing dependencies, and would mean not having to manually upload scripts to a folder and restart the bot every time you need to add or change the scripts.

The maintainer shortly thereafter attempted to do it himself, his way, and ended up with a bug-ridden and memory-leaking implementation that barely worked; it sits languishing in a lonely branch.

Don't make assumptions about code you didn't see or the attitudes the coders involved might have. We would have been (and still would be) more than happy to help maintain it had the bot turned out useful for us, but because it was a new project and the structure was changing constantly, attempting to fork and integrate upstream changes would have been a nightmare, so we decided to go a different direction.


The right way to do this is to talk to the maintainer about how he wants the feature implemented first. Especially if it involves four weekends of work.


Depends on the size of the feature, and the maintainer, but you're right about "talk to the maintainer."

Some projects, you'll want to do a weekend of really rough work, contact the maintainer, get their thoughts. Others, you'll want to do 8 days work, get it far enough along to show, get their thoughts.

Often though, before you write a line of code, talk to them.


We did, he explained it, then implemented it himself, and it was a bug-ridden, memory-leaking piece of shit that isn't merged in either.


I was once on the other side of this. A nice guy wrote a whole ton of code for a module, refactored things to the max, made it "testable" and a bunch of other things. He was very nice, very enthusiastic, and really wanted to help.

But I had to say no because it didn't fit well within the overall project, and didn't really add any value but a lot of complexity. I felt very bad to turn him down but otherwise I'd be now responsible for code I have no interest in and code I don't feel helps users.

The right way is to talk to the maintainer and agree on what to do first. Or be prepared to run a fork.



Like all memes, it is mindlessly implemented to look "cool" or "clever".


A more precise term would probably be snowclone.



