Linus: please write good git commit messages

timknauf · on Nov 27, 2011

I'm curious about this:

please do proper word-wrap and keep columns shorter than about 74 characters or so

How come word-wrapping is left as a task for humans here? Is there a technical/stylistic/cultural reason why lines can't be wrapped automatically to any desired width by the log presentation layer?

JoshTriplett · on Nov 28, 2011

Because doing so would then require extra complexity to provide a syntax for lines that should not get wrapped, such as code, transcripts, tables, ASCII art...

Hard-wrapping paragraphs to a desired length in the original log allows humans to decide which lines should wrap and which ones shouldn't, without a pile of complexity similar to HTML, Markdown, or some other markup language.

codex · on Nov 28, 2011

Couldn't our woes be solved with something as simple as soft-wrap with a tweak:

soft-wrap:

- to allow the presentation layer to soft-wrap a line, just don't add a newline.

- for new paragraphs, just add two newlines as usual

- for indent sensitive code, diagrams, etc. add newlines where appropriate, taking care to not exceed 74 columns for any line. Everyone does this already.

+ tweak:

- in the presentation layer, soft-wrap all lines except those less than 74 characters long.

This isn't perfect for terminals less than 80 characters wide, in that the diagrams won't fit, but nobody uses narrow terminals, and Linus' scheme is even more broken for this case, so it's strictly better. It also allows one to easily distinguish diagrams from non diagrams by eyeballing the text flow. Best of all, it makes fewer assumptions about terminal width.
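A minimal Python sketch of the presentation-layer rule proposed above, assuming the viewer knows the display width. The names `render` and `WRAP_THRESHOLD` are made up for illustration and are not part of git.

```python
import textwrap

WRAP_THRESHOLD = 74  # column limit taken from the proposal; treating exactly 74 as "long" is an assumption


def render(message: str, width: int) -> str:
    """Soft-wrap only the lines that are WRAP_THRESHOLD characters or longer."""
    out = []
    for line in message.split("\n"):
        if len(line) < WRAP_THRESHOLD:
            # Short lines (hand-broken code, diagrams, blank lines) pass through untouched.
            out.append(line)
        else:
            # Long lines are treated as paragraphs and reflowed to the display width.
            out.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(out)
```

Under this rule a line that should not wrap but is 74 characters or longer, such as a long log line, gets reflowed like any paragraph.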

JoshTriplett · on Nov 28, 2011

Text that shouldn't wrap can exceed 74 characters. Some examples: log messages, transcripts, large tables. So, no, that wouldn't suffice.

timknauf · on Nov 28, 2011

But how are those log messages, transcripts and large tables all that much better off under the current system? If a console is only 80 characters wide, they're still going to either disappear off the right edge or (as with most consoles I can think of) get soft-wrapped by the console anyway.

atomicdog · on Nov 28, 2011

>would then require extra complexity to provide a syntax for lines that should not get wrapped,

And what's stopping someone implementing this?

JoshTriplett · on Nov 28, 2011

> And what's stopping someone implementing this?

Most likely, not wanting to inflict such a syntax on git users, when the current approach works just fine.

But nothing stops someone from writing a patch with an off-by-default option for such a syntax, and proposing it for inclusion.

jacknagel · on Nov 27, 2011

git-log and friends indent the commit message by four spaces on the left, so wrapping at ~72 chars gives it symmetry on 80 column terminals. By wrapping it yourself, you decide where line breaks should be, not the presentation machinery.

The optimal human-readable line length is something like 66 characters. It's much easier to quickly scan a log message that's 72 characters wide vs. one that is 200 characters wide.

harlanlewis · on Nov 27, 2011

Tangent time!

"The optimal human-readable line length" isn't really something that exists, even if we make lots of assumptions about basic stuff like font size, avg word length, and color contrast.

Here's a quick study on the reading speed and comprehension of character line lengths (cl) between 35 and 95 for reference: http://psychology.wichita.edu/surl/usabilitynews/72/LineLeng...

- most efficient reading (speed/accuracy) at 95cl

- line length does not affect comprehension

But here's the head spinner:

- 60% _preferred_ either 35cl or 95cl

- 100% _least preferred_ 35cl (45%) or 95cl (55%)

Line length is an easy thing to assume everyone perceives the same way, but even given a consistent environment (definitely not a given in this age of increasingly diverse device dimensions) there's a wide range of preferences with little real impact on readability. Unless, of course, your assumptions about readability clash with the user's preferences or reading environment.

Put simply - make your life easy and reduce problems by just letting the user decide.

njonsson · on Nov 27, 2011

Making semantic line breaks (such as for bullets) is understandable, but it still feels wrong that we’re otherwise doing the job of `fold`. I started breaking lines in commit messages because it’s convention, but I still believe this is a tooling issue.

This shell script wraps wide commit messages to the width of your console.

    GIT_PAGER="fold -s -w$(stty size | awk '{print $2}') | less" git log "$@"

wnoise · on Nov 27, 2011

You might be interested in the program "par", which uses dynamic programming to optimize the line breaks, and can automatically handle such things as text prefixed by "> ".

http://www.nicemice.net/par/

kisielk · on Nov 27, 2011

Doesn't mean humans have to do it. When using vim for git commit messages it wraps them to 74 characters for me.

barumrho · on Nov 27, 2011

I hate to ask an off-topic question here, but is it considered good/bad practice to word-wrap plain text emails?

JoshTriplett · on Nov 28, 2011

Good practice. If you don't word-wrap plain-text emails, each paragraph will show up as one long line; the program reading your mail on the other end can't wrap it automatically, because it might represent code, ASCII art, terminal transcripts, or something else that shouldn't get wrapped.

So, wrap your paragraphs in plain-text email at some sensible column. Common convention suggests 72, because that allows for a few rounds of quoting before it passes 80.
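The arithmetic behind "a few rounds of quoting" can be sketched in Python. This assumes each reply prefixes every line with "> " (two characters); some mail clients use a bare ">" instead, which doubles the headroom. The function names are illustrative only.

```python
def quote(lines, rounds, prefix="> "):
    """Apply `rounds` levels of reply quoting to a list of lines."""
    for _ in range(rounds):
        lines = [prefix + line for line in lines]
    return lines


def max_rounds(wrap_col=72, limit=80, prefix="> "):
    """How many quoting rounds fit before a full-width line exceeds `limit`."""
    return (limit - wrap_col) // len(prefix)
```

With these assumptions, a line wrapped at column 72 survives four rounds of quoting: 72 + 4 × 2 = 80.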

caf · on Nov 27, 2011

This might not actually be that off-topic, because I think the underlying reason that git commit messages are stored line-wrapped is for ease of sending them unmolested through email.

telemachos · on Nov 27, 2011

I think so, yes. Certainly, something like this is very common in Mutt configuration files:

    set editor="vim -c 'set tw=75 ft=mail noautoindent'"

The 'set tw=75' bit sets the text width to 75 columns.

JoshTriplett · on Nov 28, 2011

Not actually necessary these days. vim knows to match the names used for mutt temporary files, and automatically uses mail mode and a sensible text width.

pyre · on Nov 28, 2011

IIRC tw=72 is set whenever you change the filetype to 'mail'

nodata · on Nov 28, 2011

It makes the committer think longer about their commit message.

rmc · on Nov 27, 2011

Am I the only one that thought this was someone asking Linus Torvalds to write better commit messages?

mapleoin · on Nov 27, 2011

The submission text was right, though. It would have been a comma instead of a colon if "Linus" were in the vocative case.

itmag · on Nov 27, 2011

Et tu, Line?

secoif · on Nov 27, 2011

Perhaps it would be more clear if the quote was in quotes.

caw · on Nov 27, 2011

I agree. I was looking for the irony in that message where Linus didn't follow his own style guidelines...

drivingmenuts · on Nov 27, 2011

Isn't the head guy pretty much allowed to ignore style guidelines in his own project?

I'm not saying that's a good thing, mind you.

billpatrianakos · on Nov 27, 2011

I don't think anyone is allowed to ignore guidelines. When you know the rules well, you are allowed to bend and break them when it makes sense. But people as a group are often not thoughtful so we have to apply rules to everyone even though a few individuals don't need rules applied to them, because they know what they're doing and generally don't cause trouble.

If we leave it up to the masses to be on the honor system then the problem would be much larger. Everyone thinks they're awesome enough to bend the rules when they aren't. It always seems like the people who suck the most are the ones who believe they're the most skilled too. Why is that?

tomjen3 · on Nov 27, 2011

Guidelines, by definition, only exist to guide people.

So yes, you are allowed to wander past them, ignore them, and use them as you believe most efficient.

jacknagel · on Nov 27, 2011

It's hard to overstate the importance of this.

GitHub's online editor has a default commit message along the lines of "Edited path/to/file", and I see a lot of pull requests with that message, and nothing else. That's about the most useless message possible, since it adds no information that isn't already implicit in the commit. It would be better to leave it blank and force users to at least write _something_.

tl · on Nov 27, 2011

Be careful, we do that by default and we get a lot of "commit", "commit", "commit $date", "merge commit" messages from some programmers. Either a programmer accepts that a meaningful message is important on commits, or they don't. It's a people problem not a technical one.

tomjen3 · on Nov 27, 2011

The problem with that is that at least when I am merging something I often (with Mercurial anyway) don't have anything useful to say -- the commit is even marked specially as a merge commit. Other than bringing in one branch (which has hopefully been correctly commented) no changes were made.

So really, what do you want them to say?

arkitaip · on Nov 27, 2011

Ah, but what if we encouraged programmers to write better commit messages? What if we splashed on a bit of gamification to raise everyone's level of awareness? Maybe a badge on your profile page...?

drivingmenuts · on Nov 27, 2011

How would you objectively determine the usefulness of a commit message, though? While it might contain all the necessary parts, there is no way to determine if the sum of the parts adds up to a useful whole.

Also, badges are done to death. Everyone's got badges these days - so much so they're like ads. Personally, I've developed a blind spot to most of them.

arkitaip · on Nov 27, 2011

Objectivity is irrelevant and impossible in this context of raising awareness and fun. If the mechanism was something as simple as an upvote next to the commit message, you would let the community define usefulness of the commit message.

andrewflnr · on Nov 27, 2011

Upvoting itself might be all you need.

tomjen3 · on Nov 27, 2011

And you will determine that, programmatically, how exactly?

Besides, badges are for kids.

secoif · on Nov 27, 2011

Basically, your git log should read like a blog about your project.

zacharyvoase · on Nov 27, 2011

Not to sound like I'm boasting, but most of my commits are far too atomic to warrant more than a first line. If I need to write several paragraphs about the changes I'm making, it's definitely too large a commit.

dustingetz · on Nov 27, 2011

  > If I need to write several paragraphs about 
  > the changes I'm making, it's definitely too 
  > large a commit.

careful, this is subjective, and a lot of people consider it best practice to commit in small chunks but rewrite local history right before we push to squash everything into one changeset for the issue. the idea being it keeps the master history high level which is more useful when looking at other-peoples-commits.

MostAwesomeDude · on Nov 27, 2011

So, you've never had a commit like this one?

    deep/magic: Commit changes even if not dirty.

    Changesets are somewhat magical these days (obviously!); even if they are
    not dirtied, they could end up being really, *really* bogus as far as
    their on-disk status. The reason, as far as I can tell, is that in
    deep/wizardry, we are being very liberal about our modifications to the
    on-disk data structures without actually considering whether or not any
    other owner of that structure is pending changes to commit. As a result,
    we could end up heavily corrupting these structures if we're not careful.

    I'm not altering the comments in the file because, frankly, the comments
    imply that we should have already been doing this. I just changed the
    code to match.

    Fixes #4242. Finally! Time to go grab a beer. :3

    Tests: +5 working

    mad/deep/magic.py | 1 +
    1 file changed, 1 insertion(+), 0 deletions(-)

prodigal_erik · on Nov 27, 2011

I would prefer to see that rationale in the comments or some implementation doc or something. In the git history it's eventually going to be troublesome to unearth, unless the change remains intact enough for "git blame" to give you a pointer to that old commit, or your whole team standardizes keywords for more searchable commit messages. And if your team does have a way to find that explanation, you then need some way for them to discover whether it has become inaccurate over time.

rmccue · on Nov 28, 2011

Comments about the change itself, and the rationale behind it, should go in commit messages. Comments about the implementation itself should go in the code. Granted, there is some overlap between the two, in which case, go with code only (since it's part of the commit anyway).

MostAwesomeDude · on Nov 28, 2011

You are implying that "the team" actually reads that section of the code, and furthermore that they will actually attempt to fix any issues found in that section rather than just go find the original author and ply him with beer to get him to fix it.

billpatrianakos · on Nov 27, 2011

No one is saying to write a lot. The point is to write enough. You should basically be able to hand someone the commit log and they should be able to easily figure out how the current state of the code came into being without having to dig into the code.

Small commits are fine and often don't warrant several lines. I've done that too.

It's often a fine line between a commit that's too big and too small. At the same time, just because a commit warrants a few sentences about it, that doesn't necessarily mean it was too large. I don't want to keep being long-winded here so I won't go into examples, as I think we've all seen situations like what I'm talking about, but I'd be happy to add an example later if necessary.

rbxbx · on Nov 27, 2011

http://stopwritingramblingcommitmessages.com/

adambyrtek · on Nov 27, 2011

This is a good post about readable commit messages:

http://tbaggery.com/2008/04/19/a-note-about-git-commit-messa...

moe · on Nov 27, 2011

I try to stick to that format but very often struggle to squeeze a meaningful summary into the first line.

74 chars is just awfully short...

__david__ · on Nov 28, 2011

I have the same problem but I decided that first summary line is ok to be a little long. I regularly have my summaries be about 100 to 120 chars long. My rationale is that you aren't reading a paragraph, just a single line in isolation. For me, the hard part about reading long lines is jumping from the end of one to the beginning of the next. When the entire paragraph is on one line that problem just does not exist. Plus I always give a 1 line blank before the body of the commit so there's less visual confusion.
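For anyone who wants to check the shape described here (summary line, one blank line, then the body), a naive Python sketch follows. The function name and both limits are assumptions chosen for illustration. It flags every long body line, including ones that are deliberately unwrapped, such as tables or transcripts.

```python
def check_message(message, subject_max=74, body_max=74):
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["subject line is empty"]
    problems = []
    if len(lines[0]) > subject_max:
        problems.append(f"subject is {len(lines[0])} chars (limit {subject_max})")
    # The blank second line is what separates the summary from the body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("line 2 should be blank to separate subject from body")
    # Body lines start at line 3 of the message.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_max:
            problems.append(f"line {number} is {len(line)} chars (limit {body_max})")
    return problems
```

Passing `subject_max=120` would accept the longer summaries described above while still checking the blank line and the body width.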

dustingetz · on Nov 27, 2011

we use

  ISSUE-1234: persist client-side display configuration to settings

and our issue tracking and code review software recognize issue numbers and hyperlink to the actual issue, with comments, reported-by, description, environment, replication steps, related changesets, code reviews, etc. pretty great especially considering multiple commits on the same issue which can't be rewritten into one commit because they've already been pushed. it also makes `git log --oneline` useful.
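A rough Python sketch of how a tool might recognize issue keys like the one above and turn them into links. The key pattern and the tracker URL are assumptions for illustration, not the behavior of any particular issue tracker.

```python
import re

# Assumed key format: uppercase project prefix, a dash, then digits (e.g. ISSUE-1234).
ISSUE_RE = re.compile(r"\b[A-Z][A-Z0-9]*-\d+\b")

# Hypothetical tracker URL; substitute whatever your tracker actually uses.
TRACKER_URL = "https://tracker.example.com/browse/{key}"


def find_issue_links(message):
    """Return (key, url) pairs for every issue key found in a commit message."""
    return [(key, TRACKER_URL.format(key=key)) for key in ISSUE_RE.findall(message)]
```

Lowercase hyphenated words such as "client-side" do not match, because the pattern requires an uppercase prefix followed by digits.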

adambyrtek · on Nov 27, 2011

Is it usually the case that exactly one commit maps to a single issue? I'd rather expect a whole branch to cover one issue (unless it's completely trivial). In this case, repeating the same issue number in every commit message on a branch doesn't seem very useful.

JoshTriplett · on Nov 28, 2011

For bugs, I'd expect a single commit to provide the fix; otherwise, the bug report needs splitting into multiple independent bugs. :)

Feature requests might require a whole branch, though. But in those cases, each individual commit in the branch would build towards the feature, and should reference the feature request.

adambyrtek · on Nov 28, 2011

From my experience, a bug can be simple to describe and atomic, but still require a significant amount of work to fix. I'm all for linking to bugs and feature requests in commit descriptions, but I wouldn't make it a hard requirement or put it on the first line. Your mileage may vary.

melloclello · on Nov 28, 2011

I'm terrible. I'll edit things randomly then make commits before I'm about to do something.

danssig · on Nov 28, 2011

This is one of the great things git makes it easier to do.

obtu · on Nov 28, 2011

git commit -am "Catch-up commit" is a favourite of mine. I could always rebase it at some point…

billpatrianakos · on Nov 27, 2011

At first I thought someone was asking Linus to do this.

But this comes up over and over and people should really start doing it! Especially on large and/or well-known projects. This isn't just something that's useful to others but it can help you too! I'm a naturally long-winded person but the point isn't to write an essay, you just have to sum it up on line 1, give some context and details for the commit, then let us know who you are and how to get in touch.

Fortunately I've never had to revert any changes in my work (weird, right?) but if and when that day comes I'll be prepared. 99% of my work is done alone but I still write some decent commit messages. Who here remembers the date of your changes? Do you remember the exact state of the project at that time? Even if you remember that "well, the project was in X state Y commits ago" it'll only help you for about 5 commits max then you'll forget.

You need to know what you were doing and so do others. Even if you don't plan on working with others, if you're on GitHub with a public repo someone may unexpectedly like your project and want to see what's up with each commit. I recently had a project up on GitHub that I thought was literally only useful for myself. It was a basic brochure-style website I made for someone as a favor. Well, it turns out someone here on HN wanted to learn to code, got in touch with me privately, then started exploring my GitHub repos. That guy actually used the stupid website I was building as a way to look at some code, tear it apart, and see how it worked. I was flattered and now I'm glad I write decent commit messages even if I'm only working with myself.

The point is, you'll never know when someone else or even yourself will need to look back at the logs and if the commit message is nothing but a date or something like "fixed the link" then you aren't reverting to a known past state - you're guessing.

I know a lot of Git newbies get scared after pressing return in the command line and nano or whatever built-in text editor pops up. The first few times, I didn't know how to save and end the commit message or get back to the "normal" terminal, so this may be part of why this happens, but not a big part.

In any case, I feel strongly that this is an important thing that's overlooked and I'm glad it was brought up again. Hopefully people do it.