My favourite Git commit (2019) (dhwthompson.com)
714 points by karagenit 11 months ago | 398 comments



For better or worse, my experience as a GitHub cofounder and author of several Git books (Pro Git, etc) is that the Git commit message is a unique vector for code documentation that is highly sub-optimal.

The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this commit example, all anyone would see is the very simple message about a generic "US-ASCII error" problem. Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.

The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project. But for better or worse, that is not generally the role of this text today. Almost nobody ever sees it. Unless it's discussed in a bunch of patch series over a mailing list, nobody reads anything other than the first 50 chars of the headline. It's actively difficult to do, by nearly every tool built around the Git ecosystem.

Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.
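For reference, a minimal sketch of the two-step lookup described above (blame a line, then show the full message). It assumes a throwaway repository; the file name, author identity, and commit text are placeholders made up for illustration.

```shell
#!/bin/sh
set -eu

# Scratch repository so the sketch can run anywhere; all names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'alpha\nbeta\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes" -m "Long-form body: why the notes file exists."

# Blame line 2, ignoring whitespace (-w) and detecting moved or copied lines (-C).
# With --porcelain the first output line starts with the full commit hash.
sha=$(git blame -w -C -C -C --porcelain -L 2,2 -- notes.txt | head -n 1 | cut -d' ' -f1)

# -s suppresses the diff; %B prints the raw message (subject and body).
message=$(git show -s --format=%B "$sha")
printf '%s\n' "$message"
```

In a real repository only the `git blame` and `git show` lines matter; the rest is setup so the example is self-contained.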

This is one of my biggest complaints with Git (or, indeed, any VCS before it), and I think why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.

If you want an example of this, search through the Git project's history. Run a blame on any file. It's _so hard_ to figure out a story of any function implementation in any file, but the commit messages are _pristine_. Paragraphs and paragraphs of high quality explanation for almost every single commit. Look at any single commit that Jeff King has done for the last decade. Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.

I don't know exactly what the answer is, but the sad truth of Git is that writing amazing documentation via commit message, for most communities, is almost entirely a waste of time. It's just too difficult to find them.


As someone who has contributed to Git since before GitHub existed and who maintains legacy code, I simply cannot disagree more. I use `git blame`, `git log`, and `git show` in the terminal all the time. It's trivial to follow the history of a file. It takes me seconds to use `git log -G` to find when something was added or removed.

Nothing pains me more than to track down the commit and then find a commit message that's of the form "bleh" or "add a thing" when the developer could have spent 60 seconds to write down why they did it.

Nothing gives me more joy than to find a commit message (often my own) that explains in detail why something was done. A single good commit message can save me hours or days of work.

Let me also just say, and this is a bit of a cheap shot: GitHub contributes to the problem of bad commit messages. If I'm lucky, folks have put some amount of detail in the PR description, but sadly that's not close at hand to the commit log. It's another tool I have to open. Usually though, the PR is just a link to Jira, so that's another degree of indirection I need to follow. Then the Jira is a link to a Slack conversation. And the Slack conversation probably links to a Google doc.

As an industry, we're _terrible_ at documentation. But folks like Jeff King are fighting the good fight. At the end of the day, I don't think the problem is with the technology. I think it's a people problem. Folks perceive writing documentation as extra work, so they don't. There's no immediate value to it. The payoff comes days, weeks, or months later.

Please, write good commit messages. Just spend a minute saying why you did something so that every commit isn't a damn Chesterton's fence exercise. Put it in the commit message where I can easily find it. Your future self and I thank you.

Edit to add: I didn't address your argument, that commit messages are too hard to find.

First, I don't find this to be true. I rarely have trouble following the history of a line of code, a function, or a file.

Second, commit messages have value at the time they are written even if they are never seen again. I find that writing a good commit message helps ensure that I've written in code what I've intended to (I often view the diff while writing the commit message) and they have value to the people reviewing my code.


The thing is that writing a good commit message for future people doing `git blame` is only worth it if it's a line of code which someone in the future will look at and need to know why it was changed from its previous form to the current form.

If you simply want to comment the current state of the code, you should add a comment in the code.

No one will ever need to know in the future why that particular space character is an ascii space, so the whole commit message is just a blog entry in the wrong place.

It would have made sense to just put a comment at the top of the file saying "make sure encoding is whatever".


Then don't write commit messages for the future, write them for reviewers.

Seriously, as somebody who reviews a lot of code, well-written commit messages are a godsend.

It's an awful shame that GitHub doesn't allow commenting on commit messages. It's as if GitHub is being run by people who just don't know how Git is meant to be used.


I write commit messages for future-me. Sooner or later I'm going to encounter the same problem again and wonder how I solved it last time. If I have a vague inkling that I dealt with this before, all I have to do is search through my commit history and I can find it again. I can search by author (me), I can search by date, I can search by what files I touched. It's lovely.
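A rough sketch of that kind of search, combining author, date, path, and message filters in one `git log` call. The repository, author name, file names, and commit text below are invented for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository with two commits; every name here is a placeholder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Future Me"
git config user.email "me@example.com"

echo "retry = 3" > client.conf
git add client.conf
git commit -q -m "Retry flaky uploads" \
    -m "The upstream API drops connections under load; retry three times."

echo "docs" > README
git add README
git commit -q -m "Add README"

# Narrow by author, by date, by path, and by words in the message.
# --grep searches the whole message, not only the subject; -i ignores case.
found=$(git log --author="Future Me" --since="1 year ago" \
    -i --grep="drops connections" --format=%s -- client.conf)
printf '%s\n' "$found"
```

The filters are combined, so only commits that satisfy all of them are listed.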


> It's an awful shame that GitHub doesn't allow commenting on commit messages.

You actually can comment on a commit itself. I'm in the habit of middle-clicking on the sha1 link of commits in a PR and looking at the commit itself. You can comment on lines in the commit, and there's a text area at the bottom where you can comment on the entire commit itself. I'll then follow up with making a comment on the PR linking the commit (pasting the sha1 link) and saying I made a few comments here.

> It's as if GitHub is being run by people who just don't know how Git is meant to be used.

Github wasn't really designed with code review in mind. A lot of the features they added over the years for review appear to be hacked on rather than fixing fundamental design issues (like being able to comment on commit messages without having to jump through a bunch of hoops).

Review systems like gerrit, phabricator, review board, or even email, do a much better job at exposing individual commits and their associated metadata like the commit message.


I don't think they were suggesting to review the individual commits, rather the (individual) commit messages. Commit messages are text, so you could have a similar line by line click-and-comment review interface as you already have for the code changes.


> I don't think they were suggesting to review the individual commits, rather the (individual) commit messages.

That's a good point.

> Commit messages are text, so you could have a similar line by line click-and-comment review interface as you already have for the code changes.

It would be nice if something like that was available in Github. The closest thing you could do would be to copy the commit title and body and paste it as quoted text in the text area and then comment on it inline.


> Seriously, as somebody who reviews a lot of code, well-written commit messages are a godsend.

This is the issue; are commits in your codebase for the reviewers? Or for FutureDev to see what the hell happened? These are often very different things, and most process models favor one of these at the total expense of the other.


Interesting that you perceive a dichotomy there, can you elaborate where you see the conflict?

As a reviewer, I see myself as a (extremely near, time T+epsilon) future developer who is trying to see what the hell is happening.


There's a dichotomy between a history of actual code changes, oopsies and redos included, and "a carefully crafted story of the intent of this change". One is an ugly reality, the other is a pretty story of intention. The process vs. the final result.

Both are useful, but in different contexts. The latter is good for a PR, but when optimized for that it often loses a lot of other context, and it is meant to be a throwaway bit of data that, once the PR is approved, is never seen again, nor meant to be seen again.

The former can delve a bit into what experiments were tried and abandoned and can give a lot more meaning to the final result.

> As a reviewer, I see myself as a (extremely near, time T+epsilon) future developer who is trying to see what the hell is happening

I generally find this to not be the case; reviewing is to see what this might "break" and has IME been extremely transactional.

Not all PRs nor reviewers are like this of course, and you may not be one. But, again, IME, this is the "modern" workflow.


> The former can delve a bit into what experiments were tried and abandoned and can give a lot more meaning to the final result.

I haven't found this to be true at all. Commits don't tell me anything other than when the coder left their desk for the moment. When there's a policy that allows people to have a "rough, ugly history", where a coder chooses to commit is completely arbitrary. I would argue that the clean, crafted story is the only helpful one.

You can argue that a coder shouldn't be committing unless they are at a natural milestone that can be annotated, but I would consider that the "crafted" story.


Sometimes a comment is appropriate. Sometimes a commit message is appropriate. Sometimes I need both. Often when dealing with legacy code I find neither. I'd be happy with either.

A commit message lets me tell a short story about a change that touches multiple locations in the code base. Maybe no one part of the change is all that tricky.

A commit message also allows me to explain why I'm making the change, whereas a comment may explain why the code is the way it is.

Commit messages and comments have overlapping use cases, but the Venn diagram is not a circle.

$0.02.


>Sometimes a comment is appropriate. Sometimes a commit message is appropriate. Sometimes I need both.

And often you get neither...


> The thing is that writing a good commit message for future people doing `git blame` is only worth it if it's a line of code which someone in the future will look at and need to know why it was changed from its previous form to the current form.

Well what about this example: I removed a few lines of code and explained in the commit message why I thought it was correct to do that. If somebody (possibly me) comes looking for that code and realizes it's not there, they'll be much happier to see some sort of explanation rather than a "removed lines" message.

Regarding commenting in the code vs. in the commit message, sometimes I copy-paste my explanatory comment if there is one into my commit message.


Right. You've given an example of exactly what I said was the only reasonable use case for detailed info in the commit message: someone in the future will need to know the history of that particular piece of code. It seems like the point of your specific example is to say 'in some cases you might want to know the history of a gap'. Fine. That seems like a nitpick to me.

No one in the future will need to know the history of a particular ascii encoded blank space (among a whole file of ASCII encoded blank spaces). Anyone who needs the general info that the file needs to be ascii will be helped by it being somewhere else, as opposed to in a random commit message.


But virtually every diff is one that someone in the future might want more information on. You can't know that they won't until you get to the end of the future and haven't needed it.


All communication is communication with people in the future.


> the whole commit message is just a blog entry in the wrong place.

Right. All this wonderful information and detailed error messages need to be findable by someone searching the same error. Someone digging into the code is a very different use case and they need a tiny fraction of that information.


The example in the blog post would be a much better example if some kind of test or linting step was added to catch these white-space errors, to explain the need for catching such errors.

Pro tip, you can write both comments and commit messages.


> If you simply want to comment the current state of the code, you should add a comment in the code.

I think you mean "past state of the code"...

These comments rarely get updated. My favorite recent one was several sentences describing a data structure and how it mapped out statuses, written about a decade ago. Barely a year after that comment and its code was written, the entire thing was re-written with completely different structure - and the comment left unchanged. Left a co-worker completely baffled due to inexperience with perl, we figured out what happened because of svn blame.


The drifting of code from comments is a problem I would love to see solved.

I've seen tools that can compare the git commit dates of code with nearby comments and that's a good start. However, there are potential problems with that, such as code and the comments that discuss the code not being near each other, or the code being updated and there being no need to update the comment.

I think literate programming might help here, but that's an entirely different topic really.

Looking for more advanced tools than that and I suppose we're into the world of AI - asking the tool to understand both the code and the comment and to compare the underlying meaning.

Code review is an option but outside of an organisation that's difficult to do and besides, I think the problem would be best solved by something that is repeatable and part of the build process. And I'd love to be able to have a git commit hook that can say, "hold on! you've updated code but there's a comment that now looks old". That's the dream.


Mistakes can happen. But if code comments are frequently not updated as the code evolves, it is a level of lazy that will probably manifest in other ways as well.


Definitely agree there: be it git or svn, I spent a huge amount of my bugfixing and refactoring time in the history figuring out why things are the way they are.

> Usually though, the PR is just a link to Jira, so that's another degree of indirection I need to follow. Then the Jira is a link to a Slack conversation. And the Slack conversation probably links to a Google doc.

Assuming all those links in the chain still exist. Before Jira we had FogBugz, almost all those old cases are gone (some were imported). And we used Flowdock for 10 years, that's completely gone.

Commit messages are the only thing we can rely on for this history. Use it. And try to avoid squashing commits, that erases this history - yes, even for a feature branch, changes from code review should be separate from the initial push, explain why it's being changed so we don't make the same mistake later.


> As someone who has contributed to Git since before GitHub existed and who maintains legacy code, I simply cannot disagree more. I use `git blame`, `git log`, and `git show` in the terminal all the time. It's trivial to follow the history of a file. It takes me seconds to use `git log -G` to find when something was added or removed.

I 100% agree with this. I do this all the time. I also agree with the rest of the post. The sentiment raging against these longer git commit messages smells very much like elitism to me.


I'm mixed on this. My project has a bug tracker. A commit is required to have a bug id. The bug tracker has entire discussions of what led to the commit, so it's not clear to me that a detailed commit message is a plus when the real detailed info is in the tracker. Yes it's indirect but there's no way I'm going to summarize the entire issue discussion.

Maybe this is a job for machine learning. Read the code, read the commits, read the bug tracker, add a git super-blame that asks the LLM to summarize why every line is the way it is and what it's doing.


> A commit is required to have a bug id. The bug tracker has entire discussions of what led to the commit

Companies do change bug trackers and ticketing systems and those links may no longer work years down the line.

> The bug tracker has entire discussions of what led to the commit, so it's not clear to me that a detailed commit message is a plus when the real detailed info is in the tracker. Yes it's indirect but there's no way I'm going to summarize the entire issue discussion.

But summarizing it can be one of the most valuable things you can do for a maintainer who has to make changes years after you've moved on. For one thing, the problem and discussion is fresh in your mind and you understand the context. In a few minutes, you could summarize the problem, the approach taken to fix it and alternatives that were considered but not used because the chosen solution clearly didn't have an issue/was more efficient, etc.

Even if you didn't want to do that, you could just copy and paste the entire discussion text at the end of the commit message so that even if the bug tracker is no longer in use in the future, the discussion itself was preserved in the commit history and accessible via git log or blame.


> > A commit is required to have a bug id. The bug tracker has entire discussions of what led to the commit

> Companies do change bug trackers and ticketing systems and those links may no longer work years down the line.

I've experienced this twice, we switched from Bugzilla to FogBugz to Jira in my time. With one relatively small exception in the FogBugz to Jira transition, all past case information was lost.


This is why at work the only required rule for commit messages is that they include the story number, so we can very easily find at least the general reason for a change from git blame.
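One way such a rule could be enforced is a `commit-msg` hook, which Git runs with the path of the message file as its first argument. The sketch below is only an illustration: the ID shape (`PROJ-123`, uppercase letters, a dash, digits) and the function names are assumptions, not anything described above.

```shell
#!/bin/sh
# Hypothetical helpers for a commit-msg hook. The story ID format
# (two or more uppercase letters, a dash, digits) is an assumption.

has_story_id() {
    # Succeeds when the message file contains something shaped like ABC-123.
    grep -Eq '[A-Z]{2,}-[0-9]+' "$1"
}

commit_msg_check() {
    # $1 is the path to the file holding the proposed commit message.
    if has_story_id "$1"; then
        return 0
    fi
    echo "commit message must reference a story, e.g. PROJ-123" >&2
    return 1
}

# In .git/hooks/commit-msg the last line would be something like:
#   commit_msg_check "$1" || exit 1
```

A non-zero exit from the hook makes `git commit` abort, so the author gets the reminder before the commit exists.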


> It's trivial to follow the history of a file.

If the committer uses the "show history of a file" process model. These days, it's mostly "squash commits until I get a good/flattering 'story' of what I did", which removes typos, failed experiments, mid-thought commits, and any other blemishes of _what actually happened_.

Commits CAN be used as great history, if history is allowed, but I've found that "modern" workflows tend to the rebase/squash side of things and also are mostly write-only.


I still wonder how much of rebase/squash-heavy workflows would disappear if UIs like GitHub and the CLI itself defaulted to `--first-parent` style views (with optional drilldown) and used the power of navigating a DAG for a little more good. All of this good commit metadata lost just because subway diagrams are pretty but also so many people find them confusing and messy. Or because GitHub shows commits as a flat confusing list with an order that only makes sense if you saw the subway diagram but GitHub's default view doesn't draw the subway diagram.
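A small sketch of what a `--first-parent` view looks like next to the full history, assuming a scratch repository with one feature branch merged using `--no-ff`. Branch names and commit messages are made up for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository; names and messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

git commit -q --allow-empty -m "Initial commit"

# Feature branch with the messy, honest history.
git checkout -q -b feature
git commit -q --allow-empty -m "WIP: first attempt"
git commit -q --allow-empty -m "Fix typo from first attempt"

# Back to the starting branch; --no-ff forces a merge commit to exist.
git checkout -q -
git merge -q --no-ff -m "Merge feature: handle odd encodings" feature

# Mainline view: only commits reached by following first parents.
mainline=$(git log --first-parent --format=%s)
# Full view: the branch commits are still there for drilldown.
full_count=$(git rev-list --count HEAD)
printf '%s\n' "$mainline"
```

The mainline view lists only the merge and the initial commit, while the two work-in-progress commits stay reachable from the merge for anyone who wants the details.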


How does this work in the face of unrelated refactoring? Say you first fix a bug somewhere, with a great commit comment. Then some refactoring happens, and the affected function is moved to a new class in a new file. Are you still able to track the original git comment?


Say you found the current commit through `git blame`. You run `git show` on that commit. The diff shows that the function you're interested in was actually moved so from the diff output you can see the previous filename and you have the function name. You could use:

  git log -GsomeRandomFunction -- path/to/some/random/source.ext

That will then find the commit where `someRandomFunction` was removed from `source.ext` and then the commit before that where it was added to `source.ext`.

Git log has a gazillion options and I've probably used them all at one time or another, but 99% of the time, the only ones I need are `-G` and following a particular file.

https://git-scm.com/docs/git-log#Documentation/git-log.txt--...
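To make the moved-function case concrete, here is a minimal sketch in a scratch repository. The file names `old.ext` and `new.ext` and the commit messages are hypothetical; only `git log -G` with a path limit comes from the description above.

```shell
#!/bin/sh
set -eu

# Scratch repository; file names and messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Commit 1: the function first appears in old.ext.
printf 'someRandomFunction() { :; }\n' > old.ext
git add old.ext
git commit -q -m "Add someRandomFunction" -m "Explains why the function exists."

# Commit 2: an unrelated refactoring moves it to a new file.
printf 'someRandomFunction() { :; }\n' > new.ext
printf '# helpers moved out\n' > old.ext
git add old.ext new.ext
git commit -q -m "Move helpers into new.ext"

# -G lists commits whose diff adds or removes a line matching the pattern.
# Limited to the old path, that is the removal first, then the original addition.
history=$(git log -GsomeRandomFunction --format=%s -- old.ext)
printf '%s\n' "$history"
```

Running `git show` on the older of the two commits then brings back the original explanation, even though the function no longer lives in that file.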

I also have `diff.renames` set to "copies" and `diff.algorithm` to "patience" in my `.gitconfig`:

https://git-scm.com/docs/git-config#Documentation/git-config...

https://git-scm.com/docs/git-config#Documentation/git-config...
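For anyone who wants to try those two settings, a sketch using `git config`. It writes to a scratch file (an assumption made so the example does not touch real settings); swapping `--file "$cfg"` for `--global` would put them in `~/.gitconfig` instead.

```shell
#!/bin/sh
set -eu

# Scratch config file so the sketch leaves real settings alone.
cfg=$(mktemp)

# Detect copies as well as renames when diffing.
git config --file "$cfg" diff.renames copies
# Use the patience diff algorithm.
git config --file "$cfg" diff.algorithm patience

# Read the values back and show the resulting file.
renames=$(git config --file "$cfg" --get diff.renames)
algorithm=$(git config --file "$cfg" --get diff.algorithm)
cat "$cfg"
```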


> Edit to add: I didn't address your argument, that commit messages are too hard to find. First, I don't find this to be true. I rarely have trouble following the history of a line of code, a function, or a file.

I don’t think this is a proper way of reasoning. What is hard and easy is subjective, and you discuss it as if it were objective. Word against word. It would be wise to have some poll and see results.

Just because one geek is writing and reading commit messages doesn’t mean they are easily accessible to everyone. It’s hard to make something a widespread standard if tooling doesn’t make it super easy to access. Allow people to leave kudos and emoji on other people's commit messages and people will start making them better :D And later show heroic people with git --stats


His credentials indicate that it may be possible that his arguments are based on data while your credentials and evidence indicate personal, anecdotal experience. Therefore I would trust his reasoning more. Additionally, I personally identify with it.

I mean a git developer finds git easy to use? That's biased data.

I love how both of you dropped your street cred before launching into your reasoning. It just shows how much more credentials convince people than the argument itself. Normally that stuff logically doesn't matter and people are just doing it to grab some "authoritah" but in this case your backgrounds actually contributed to the arguments.


> The main problem is that Git was built so that the commit message is the _email body_, meant to be read by everyone in the project.

I find this very hard to believe. Isn't it "everyone who is interested in the commit subject/files touched should read the body". Why would anyone else read immutable historical documentation?

> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.

This sounds like you are joking. Any good IDE will be able to annotate each line with blame info, and show the diff at the press of a button. On such diffs, the IDE should allow recursive blaming on context/deleted lines. Tools like Tig allow exactly that.

GitHub certainly does make it hard to see commit messages, I give you that :)

> Hundreds of hours of amazing documentation from a true genius that almost nobody will ever appreciate. It's horrifying.

?? It's not like it was written for fun. This documentation attached to a commit exists to reduce the risk of accepting the patch from someone who might not be around in future, to fix any problems introduced. By disclosing all their relevant thoughts, the author shows their good intentions: they enable others to build on top of their work. If the author kept their thoughts to themselves they would gradually build up exclusive ownership of the code, which is often not a good idea. Also a commit message serves as proof of work, which can be important when there's too many patches. For commercial projects some of this is less important.


I might be in the minority, but parent's comment is probably about people like me: most of my coworkers have context free, or at best succinct commit messages. I never read more than the first line listed in the commit list, and don't even assume the description is always accurate.

Instead I'll spend my time stalking the related merge request, where the full description of the whole change resides, with probably a link to the ticket or reference documentation, and all the back and forth on why something is or isn't a good idea.

I think the world could be a better place if all of that was in git directly, but that's also putting much more burden on an already complex tool.


> I find this very hard to believe. Isn't it "everyone who is interested in the commit subject/files touched should read the body". Why would anyone else read immutable historical documentation?

If you think about who created Git (Linus), then it suddenly makes sense that the commit message is like an email body, since most of the Linux kernel collaboration is done via a mailing list.


> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages

I am terrible at git on the terminal, but with IntelliJ or emacs and magit, I can trivially find every commit ever to change a file, and easily navigate the commits to see every full commit message. It's not hard when you use a proper tool, and I have a feeling almost everyone has something like that?! Do you really try to stick with the git CLI and memorize hundreds of commands and flags?? Why?!


Really simple answer: Repeatability. I am not saying it is the only one right blessed answer, but if you really want to know why people haven't moved to pure GUI interfaces, imagine describing to someone how to add a new directory to their path.

  fleet $HOME/.config/fish/config.fish
  # ADD this line somewhere
  set -x PATH /opt/git/bin $PATH

Or:

1. Either hit WINDOWS-E, right click on This PC and select Properties (it might be called something other than This PC if someone renamed it), or press the WINDOWS key or click Start or click the Windows icon (if you don't see them, try mousing into a corner of your screen (typically bottom left) until they and the rest of the bar un-autohide), look for and click a gear symbol (should expand to say Settings if you hover), click System, and on the left at the bottom you should see About.

2. Click the text Advanced system settings (on the right), look for a new window with a set of tabs, you want Advanced. Click the button Environment Variables.

3. In the top columnar box EITHER find a variable named Path, highlight it and click the button Edit, in a new window click the button New, type '/opt/git/bin' in a text field that has appeared at the bottom of the list items, click OK, OR click the button New, in a new window enter Path for Variable name and /opt/git/bin for Variable value, click OK (you shouldn't need to Browse Directory or Browse File).

4. Click OK button, click OK button, close Settings window.


Have you ever heard of "abstraction"? People that actually use windows can handle opening the start menu as a single part of a step. There's no conscious checklist for how the UI can be customized.

If you're going to make that into a complicated mess, then you absolutely do not get to assume the user understands "add this line somewhere" or has "fleet" installed and set up the way you expect.


Yes, I think we agree that abstraction is great (with or without "scare" quotes). My point is that CLIs are valued as tools of explication, repeatable explication. I have actually used Windows since 3.1. I cherry picked a particularly juicy example that I run into a lot.

> If you're going to make that into a complicated mess, then you absolutely do not get to assume...

As far as tooling goes, the GP mentioned IntelliJ so I rewrote the code with fleet. I could easily have picked emacs or vim or bash or zsh or tcsh instead of fish and the complexity of the interface would have remained static. I think HN formatting tools are partly to blame for the messiness, but if you look at any quality set of docs describing a complicated computer interaction, to achieve the same level of repeatability as text-based, POSIXy interactions you are going to need a lot of screen shots and a few this or thats. WHICH is fine! Remember software engineering is about trade offs!

EDIT: CLI allows for abstractions like $EDITOR and $SHELL


On Windows it's actually just:

1. Press Win key 2. Type env 3. Choose system or account


It's even better because thanks to the Start Menu randomization process either could appear first in the results. Sometimes they will switch position after being presented.


Thanks! This is what I was secretly hoping for. I am doing this a lot lately.


Your fictional example is not a good comparison -- I just can't imagine the scenario where you need to explain to someone who doesn't know how to modify their path why they need to add something to it.

For someone actually using git (and the CLI, at that) I'd expect to be able to say "oh, make sure git is in your path" and for them to understand how to check and set that, or at least be able to Google it and follow the instructions themselves. Likewise I'd ask something like "Can you cherry pick just that bug fix into a new PR so we can merge and deploy it today?", not give them a series of git CLI commands to paste in.

My observation of git beginners is ones using CLI say things like "oh, I screwed up my repo and had to clone a new copy". Good GUIs don't easily cause this situation, and mostly let you see and fix what happens when you do some weird accidental merge or rebase or someone else has force-pushed.


> I just can't imagine the scenario where you need to explain to someone who doesn't know how to modify their path why they need to add something to it.

Sounds like someone hasn’t had to train fresh graduate engineers for a while ;)


Insufficient explanation about how to add something to PATH (specifically the tools for compiling java) meant that I started programming 2 years later than I would have otherwise.


Yeah, I think I conflated a git specific question from the GP and a more general CLI question, my bad.

The argument can be made that interfacing with git is bad whether with mouse or with keyboard. My git secret weapon is to ask myself how do I make git do this thing that is easy in Subversion or Fossil and then I do that thing and I write it down so I can do it again in X number of months.


If the tool calls git(1) then it can show you the script that your actions produced. Magit has something like this but I’ve never used it for that (since I also use git(1)) so I don’t know if it captures the whole context/commands.

I used a GUI frontend to R in a statistics course. Never needed to write R myself.


Don't even need the set -x, can just use fish_add_path for convenience.


IME git abstractions make it easy to read and navigate standard workflows, but incredibly difficult to repair issues that arise due to divergence of some kind or another because they are so opinionated.

I use git 99% in the terminal, and 1% in some git tool for visualization, but I find that a lot of people use it in the opposite way and have problems working with others that use a very slightly different workflow. You don't need to memorize hundreds of commands and flags, honestly a dozen or two gets you to expert status in most respects.


I don't have any problem at all, when some really tricky stuff needs to be done, I google for a solution and run whatever command magic I find. If you don't need to google for git commands to do uncommon things, I imagine you have a huge capacity to memorize things, good for you, but most of us don't.

I do understand how git works and could use the CLI most of the time if I wanted to, but there's exactly zero reason to do so. The GUIs offered by modern tools make it much more convenient and efficient to do things correctly. You really shouldn't commit stuff without doing a careful review of the changes first, which is terrible to do in the terminal compared with using a GUI for that, for example.


ChatGPT is really good at giving me the git invocation I need for weird complex stuff that doesn't come up everyday.


Terminal for repeatability, gitlab for visualization is a good combo I’ve found. Push your branch and a great diff is waiting for you.


I don't find it more difficult to use or remember commands for than remembering how to accomplish similar tasks in some GUI (especially if that GUI is emacs). And unlike most GUIs (emacs may be an exception), I can trust that my knowledge of the git CLI won't become out of date when my GUI tool inevitably undergoes a UI redesign of some sort.

But more importantly, the CLI allows my typical workflow where I chain together a bunch of git (and other) commands in a row, allowing me to just type in, for instance, several different commits, their messages, and what files should go into each in one go without having to break my concentration by having to move around in some GUI between commits. Sprinkle in some stash manipulation and interactive rebases, compilation, and unit testing, and you'll really start to see how the CLI allows you to offload some of your working memory to your invocation in a way that a GUI just can't.


> Do you really try to stick with the git CLI and memorize hundreds of commands and flags?? Why?!

Because IntelliJ is... less capable than it should be. Personally, I find `git add/commit -p`, `git diff` far easier to use than IntelliJ, and because Python is a fucking mess I had to install the codecommit git helper into a Python venv... but you can't tell IntelliJ to use that venv's $PATH for `git pull`/`git push`.

Oh, and you can't really macro complex stuff in IntelliJ, whereas I can do a single-command release and push-tag of a project with about 30 Git submodules in a (convoluted) Bash one-liner.


I don't know IntelliJ well, but I would be surprised if they did the rather expensive rename following that the multiple -C invocations did. Maybe someone can inform us here? GitHub definitely does not, but that is 100% my personal fault I assume.


I was mind blown reading this also - are we not programmers for the sake of laziness in the face of these kinds of "problems"? I have to hail Tim Pope for Fugitive.vim also. HAIL TIM POPE!


100% this


This is a failure of GitHub etc. GitHub tries to dumb things down for users because I guess it's judged they can't reason with commit histories and this is one of the consequences. The mess in especially private GitHub repos is beyond belief sometimes.

The thing is there's nowhere else for such documentation to go. It's not appropriate for a code comment. But we've got a whole generation of developers now who think git is GitHub and the only purpose of git is uploading changes to GitHub.

Git sucks, but it sucks a lot less than everything else. But we need to go back to basics and understand what version control is actually for.


It's great for historical research though. It's one of the few pieces of documentation that will live with the code forever. github and other forms of centralization are not open data formats that folks trivially backup/convert/carry forward. They usually leave the data behind if they move the project somewhere else.

So no, I don't think it helps the current community much either. But it helps the debugger years later.


Is it great for historical research? I feel like the format and tooling around it is uniquely _not great_ for historical research. I think it's optimized for discussions before integration, which is what PR descriptions and comments are largely used for now.

I feel like given great commit messages, determining a story and useful history around any block of code given the Git tooling is incredibly difficult even if there are _amazing_ commit messages.

Like say you are trying to determine why a 10 line function is the way that it is. You blame it. Not even with the stupid-simple GitHub UI that _I_ originally wrote, but with the more expensive CLI interface that follows renames and ignores whitespace changes, etc. Now you get a list of SHAs of commits and the first 50 chars of commit messages for each line for the last modifications, etc. How do you even stitch those messages into a useful story (in order) to tell you how that function evolved to what it is now and why?


It might depend on which tools you're using. When I'm doing historical research for how a function evolved, I normally run "gitk" on the file, and walk through the commits; the full commit message is shown together with each diff to the file. It used to be even better in the past, when gitk showed the full commit diff, instead of the diff to just the file I passed on the command line, but "git show" on the commit hash (or another gitk which is not filtered to a path) is good enough.


Tediously, commit by commit. But it's often better than the alternative. Design decisions and business logic kept separately from the code or source control are infinitely harder to reference code against, and realistically that documentation will be lost.

If you have the git repo then there's at least some chance of being able to dig through the history, which is kept with the code. Especially for stuff that code cannot document, when you're working with devs that seem to firmly believe that code is self-documenting.

Doesn't mean that every code base needs to have amazing git commits. But code bases expected to live a long time at least give some possibility to string together a history after some work.


> I think it's optimized for discussions before integration, which is what PR descriptions and comments are largely used for now.

As a GitHub co-founder, whose fault is that? I have seen many great PR descriptions on GitHub that never make their way into the final inclusion in the main/master git history.

Meanwhile the git project links every commit to the message id whence the original patch came (for many years now, though not for the whole history). Which will be available as long as the email archives are out there somewhere.

And the commit messages get reviewed into a good shape. Something that I’ve never seen anyone do on GitHub.


But Github and similar tools actually solved this problem where Git failed to do so. Nowadays people have a setup with Github or bitbucket where they can navigate from a piece of code right to the pull request, where they can read the code review discussion, see the build log, reach linked resources like the Jira, etc.


“Just go to our web app” is not solving the same problem as what Git is trying to solve (the latter sometimes badly, it might be added).


P4V (Perforce Visual Client) is amazing for visual historical research. I haven't seen a git tool like it, but I'd love one. https://www.perforce.com/video-tutorials/vcs/using-time-laps...


I feel like you're complaining about a problem which you helped create.

So, with all due respect, do your part to fix it. For example, by allowing review comments on commit messages in GitHub. Gerrit gets this right, FWIW.


> I think it's optimized for discussions before integration, which is what PR descriptions and comments are largely used for now.

This isn't even a git concept though; it's something that was tacked on top of it. What you seem to be saying here is that a third-party tool building on top of git spawned a social movement that moved this layer up a level. Not every project uses github or a github workflow.

> I think it's optimized for discussions before integration

It's optimized for discussion of the purpose of the code unit in question. That discussion can be useful before integration; but pre-integration discussion can happen any way you like. PR discussions work, e-mails on mailing lists work. Face-to-face discussion works.

The real value (for me, I guess; apparently you just don't see it that way) is explaining the purpose (and possibly circumstances) of the commit, after the fact, when I'm looking at it for some reason or other. Not finding the commit, but explaining it once I'm there. A well-written commit message can be absolutely priceless.

Maybe this last point should go in a top-level response to your original comment, but I'm already here, so I'll just say it here. Saying that commit messages are terrible because only short-messages (the "subject line") are shown by default, seems to me about the same as saying e-mail bodies are useless for the same reason, or that file contents are terrible because `find` only lists file names by default. You 'have' to collapse by default, or you'd drown in a sea of commit messages anytime you tried to list anything.

> Like say you are trying to determine why a 10 line function is the way that it is. You blame it. Not even with the stupid-simple GitHub UI that _I_ originally wrote, but with the more expensive CLI interface that follows renames and ignores whitespace changes, etc. Now you get a list of SHAs of commits and the first 50 chars of commit messages for each line for the last modifications, etc. How do you even stitch those messages into a useful story (in order) to tell you how that function evolved to what it is now and why?

Okay, I hear you, this is not the most ergonomic procedure to one-off. But seriously, you have the SHA commits. If you need to do this often, write a tool that takes those SHA commits, orders them based on log order (or chronological order, w/e, pick an ordering mechanism), and prints out whatever information is interesting to you. A simple display that can expand/collapse full messages, diffs, etc. would probably do nicely. It can be a GUI tool, a CLI tool (menu-driven, maybe); whatever works for you. This should not be a big deal to write for the common case, and if you think it's that critical to the community, publish it.
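
A rough sketch of such a tool in plain shell, to show the idea is small. It assumes a repository with 40-character SHA-1 object names and orders commits by committer timestamp; `blame_story` is a made-up name, not an existing git command.

```shell
# blame_story FILE START END
# Hypothetical helper: print the full messages of the commits that last
# touched lines START..END of FILE, oldest first.
blame_story() {
  file=$1 start=$2 end=$3
  # -w ignores whitespace changes; -C also detects lines moved or copied
  # from other files modified in the same commit.
  git blame -w -C --porcelain -L "$start,$end" -- "$file" |
    grep -E '^[0-9a-f]{40} ' |   # porcelain header lines begin with the commit SHA
    cut -d' ' -f1 |
    grep -v '^0*$' |             # skip the all-zero SHA used for uncommitted lines
    sort -u |
    while read -r sha; do
      # Prefix each SHA with its committer timestamp so it can be sorted.
      printf '%s %s\n' "$(git show -s --format=%ct "$sha")" "$sha"
    done |
    sort -n | cut -d' ' -f2 |
    while read -r sha; do
      # -s suppresses the diff; %B is the complete message, subject and body.
      git show -s --date=short --format='== %h  %ad  %an%n%n%B' "$sha"
    done
}
```

Something like `blame_story path/to/file.c 120 130 | less` (path and line numbers are placeholders) would then give the messages as one readable stream instead of a column of truncated subjects.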


Till the team you are handing off the code to just copies the files and commits into a fresh new repo without any of the history. I had this happen once to a server I wrote, and then like 2 years later the new team comes and asks me if I knew of the server, and I'm like "I wrote it" and then they are all confused.


Well, `git` is still the primary way I interact with a git repository, and `git log` shows the entire commit message by default. So I don't run into this problem.

If some "modern" git frontend is only capable of displaying the first line of a commit message, then this is a problem with that tool, not git itself.

(I'm also not convinced this is a limitation of all modern tooling...)


I can't tell if this is engaging with trolls or not, but I can't imagine that all of your interactions with your codebase are via `git log` with no other flags. Even with the normal Git CLI that most of us use daily, most of us use `--oneline` or whatever to simplify useful calculations and visualizations like `--graph`, etc. But we're talking here mostly about code archeology, learning about the history of a block of code, so this comment seems somewhat ridiculous in that context.


> I can't imagine that all of your interactions with your codebase are via `git log` with no other flags.

When did I say anything like that?

My point is just that the `git log` command, by default, shows the full commit message. The same goes for `git show`. So a user of the git CLI will regularly see complete commit messages, unless they purposefully request a different format. So, it is not some inherent problem in git that the complete commit message is hard to find. That's just a limitation of certain Git frontends.


> I can't tell if this is engaging with trolls or not, […] that most of us use daily, most of us use `--oneline`

You speculate that someone who uses git log without listing (or complaining about) all their flags is a troll?


Is it possible that you’ve been hit by

https://xkcd.com/2501/

?

  git log | less

  /whatever

Works OK for those of us who don’t know any git flags.


The only sets of arguments I use to git log regularly:

* `git log branch` because I want to cherrypick or checkout parts of another branch.

* `git log --stat` because what files changed can be a big clue for what I'm looking for.

* `git log -- dir1/ file1/` because I only care about commits to a certain part of the tree.

Other than that, `git log` already provides so much information to /search or even `grep` through that I can't think of any other flags I use regularly, and if you don't use them regularly you forget them.

The real GOAT that people are sleeping on is `git rebase --interactive` where you can go back and edit part of your branch to clean it up before rebasing or merging towards main. The cleaner the commits are, the more useful they become later for other tools like log, merge, rebase, cherry-pick, bisect, etc.


A rebase to clean up your branch is great, and I lean on my team to do this. Unfortunately it's impossible to automate, because it amounts to craftsmanship. I've seen larger teams fall back to squash-merging, which at least discards checkpoint/broken/WIP commits. But it loses the nuance of more complex changes performed in logical stages.


I don't know how my comment was understood to mean that I am unfamiliar with git. My point was that those of us that use the git CLI have no issues seeing the rest of a commit message besides the first line, and in fact this is the default.


Why an explicit `| less`? Git already uses a pager by default.


periodic reminder that `gitk` exists, and has come with git since... pretty much forever? If you're reading `git log`, you really owe it to yourself to run `gitk` at least once to see what you've been missing for over a decade now.


What gave you the impression that I haven't heard of gitk?


When someone mentions an approach, and someone else mentions a different approach, neither comment is "for one person only".


Well, I’ve heard of gitk too but gitk is not available by default on my default installation of macOS so there’s that.


I'm not sure why anyone would use default installations of any developer tooling on Mac? Was the first thing you did when you finished initial startup on your mac not "install homebrew, then install the most up to date versions of git, python, etc. etc."?


gitk is great if you love subway diagrams and want a built-in tool for it. There's so much power in `git log --first-parent` and recursing into drilldowns that current UIs, including gitk, are bad at expressing as a workflow.


Your entire argument boils down to the fact that it's hard to view git blames. It's not.

As stated by other people, IDEs like VSCode and IntelliJ do an extremely good job of showing the blame. And they DO show the entire commit, body and everything at once.


I never considered the idea that it was atypical, but I read full commit message text all the time. There are many different ways to drill down into a commit, and then read the entire commit once you know it's relevant. Even doing a simple git log, and then a searching for some keyword through every full commit message, can be useful.


Many editors have great git blame integration that makes these messages quite accessible.

It's really easy in emacs with magit to view commit messages from git blame view.

I believe vim, vscode, and jetbrains IDEs all make this simple.


Yeah, a lot of these also have Github and ticket tracker (Jira, etc) integration so they'll also pull in context from those, too

Most of the stuff I work on uses merge commits on Github so you can just click the PR # in the merge commit message and arrive at the PR, browse through commit messages, discussion, etc


Using vim-fugitive it's

  :Git blame %


> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

Maybe I do it wrong, but the most basic interface I use to check the git history is `git log`, which shows the whole commit message.

GitHub takes me 18 clicks to find the commits, I don't see why I would even bother using it.


Many engineers primarily or even exclusively use git via GitHub's interface and have never made a commit with a body.


Those "engineers" go on _The List_


Right, but then maybe the main issue is those engineers, and not the tooling? When I see someone using a hammer the wrong way, I don't usually blame the hammer.


The problem is the network effects. If enough people start hammering with the side to pull out nails, like 95%, there's a chance that could become known as "the right way".

I think a lot of the "get shit done" crowd for instance sees using Github and not "futzing around with git" as the right way to do things for getting shit done.


Sure, but because "the majority thinks it is the right way" does not mean it is. My point is that when the majority of people don't know how the most basic tool they use is meant to be used, then it is a problem.

It would still be better to say "I know how a hammer is meant to be used, but I choose to use it this other way", but that is not the situation right now. The situation right now is that more and more devs don't know that git != GitHub. And that inequality is an objective fact, it has nothing to do with network effects.

Everybody believing that the Earth is flat would not make it flat.


> The main issue is that most of the tooling ... generally only shows the first line.

> I don't know exactly what the answer is

Isn't it obvious? Write better tools. There is no reason you have to be stuck with the deficiencies of what someone else has built. That's the whole point of open-source software.

It's more than a little concerning that a "GitHub cofounder and author of several Git books" has to have this pointed out to them.


>There is no reason you have to be stuck with the deficiencies of what someone else has built.

There is a concerning trend of "we only use vscode" and popular preference shifting to "adjust to popular tool" rather than "use best tool".

This means sadly things like GitHub start to define git even more for your coworkers.


I don't know how it's for everyone else, but I do value the body of the commits from others. It's true that I see only the subject line for most commits. But I eventually read the full body of commits I'm interested in. Honestly, it's frustrating when commit messages don't carry enough context. Sometimes that context fits in the subject line. For others, I expect an elaborate body.


On my work I make 1-15 commits a day. If I have to spend thought cycles on the commit message, that is time that goes from other productive endeavours.

I think, as the original commenter also wrote, this might be worth it in much slower-paced projects that are run at another cadence / over mailing lists.

I particularly think that high-paced application development does not benefit from git as documentation.


I would argue (rightly or wrongly) that there are two common truths to such a scenario:

Scenario 1, you’re doing a bunch of small changes that work towards a larger purpose. They’re what I like to call “checkpoint commits”. They aren’t the whole story —- just a step along the way to whatever you’re trying to accomplish.

Scenario 2, you’re coding instead of thinking. Making “random” changes until you get what you want, but because you’re continuously delivering, they all go to production. Note that “you” here might be the developer, or it might be business people demanding things from said developer.

In scenario 1, IMO you should be working on a branch. Then, when you’re finished, you squash your commits and replace the countless mini-messages (“fixed”, “Oops”, “wtf?”) with the actual message you want to be there when you merge it.

In scenario 2, especially if it’s driven by business, you’re probably SOL. In this instance, however, I tend to feel like people are making more work for themselves. If they stopped and thought it through for half an hour before starting work, it might only take an hour’s worth of work and one commit, instead of a day and thirty commits.

Of course, there are always shades in between. :)
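
For scenario 1, one way to do the squash non-interactively is a soft reset to the fork point. This is only a sketch: `squash_onto` is a hypothetical helper name, and it assumes the branch has not been shared yet, since it rewrites the branch's history.

```shell
# squash_onto BASE MESSAGE
# Hypothetical helper: collapse every commit made since the current branch
# forked from BASE into a single commit carrying MESSAGE.
squash_onto() {
  base=$1 message=$2
  fork_point=$(git merge-base "$base" HEAD) || return 1
  # --soft moves the branch pointer back but leaves the index and working
  # tree alone, so all the checkpointed changes end up staged together.
  git reset --soft "$fork_point" &&
    git commit -q -m "$message"
}
```

The "fixed" / "Oops" checkpoints stay reachable through the reflog for a while, so this is less destructive than it looks.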


Neither of these; the granularity of the changes is just smaller.

Yes, you could argue that we should go with preview envs and only merge larger changes into main. But then again, this adds considerable complexity to the infrastructure – something that might be merited when we scale to 10+ software engineers.

This is the nature of products where you work close with designers, POs, etc.

You simply don't go to this effort to update text, positioning, colors, etc.

In particular: Remember that git is _not_ just for kernel-style projects.


> On my work I make 1-15 commits a day. If I have to spend thought cycles on the commit message, that is time that goes from other productive endeavours.

Do you apply that to everything? Like not answering questions from your colleagues, not writing tests, not refactoring, not optimizing, etc?

I personally don't measure my productivity by the number of commits I push. If I did, I could easily make 100 commits a day. And there of course it would be better for me to not care about the commit description, because it would take thought cycles and anyway the commits would make no sense.


Like the sibling comment, this comment reads like a person who doesn't realise the breadth of projects git is used for.

Not answering questions from my colleagues? No need to be snarky, let's keep a good tone here.

Small refactorings are a good example of code I would not write long commit messages for. Like going through a function improving its clarity and adding comments; I would not redo that effort in the commit message. Text updates, style updates, etc. are also things that rarely merit big messages.

Great for you that you don't make 100 commits a day – but watch out that you don't mix disparate changes into a single commit.


> Not answering questions from my colleagues? No need to be snarky, let's keep a good tone here.

I didn't mean to be snarky, sorry if I read like that! I was trying to list examples of "non-coding" that I find are important :-).

> Small refactorings are a good example of code I would not write long commit messages for.

I totally agree! Now I am starting to think that we all agree here. I was just confused because your comment seemed to disagree with its parent, which says: "Sometimes that context fits in the subject line. For others, I expect an elaborate body".

But apparently you do agree with that: sometimes it is worth writing a long commit message, sometimes it is not. It depends on the situation, and then it's a matter of common sense/experience.


> On my work I make 1-15 commits a day. If I have to spend thought cycles on the commit message, that is time that goes from other productive endeavours.

I make roughly that many commits a day as well. If something's easy to understand I'll put in a simple commit message (e.g. [1]), but I do put in the effort for more complicated ones (e.g. [2]).

[1] https://github.com/nextest-rs/nextest/commit/efd194b2e1d8d61...

[2] https://github.com/oxidecomputer/omicron/commit/b07a8f593325...


> I think, as the original commenter also wrote, this might be worth it in much slower-paced projects that are run at another cadence / over mailing lists.

Very much this.

If I'm modifying some rather obvious and overall simple thing like an obvious config of a grafana, adding a customer to a config and such things... it's hard to really bother with a long commit message. Also, with modern tools like VSCode with the Gremlin plugin, I don't think I'd have spent many words on removing a weird whitespace from a code base, to be honest.

On the other hand, if I've spent 4 hours thinking and 2 of those hours discussing the change with another DBA changing a 2 into an 8 in the config of an SLA-critical postgres cluster... spending 10 minutes on a commit message in the config management is - with regard to time - a footnote, irrelevant and inconsequential.

But it can be worth more than gold down the road if you ask "Why 8? Why not 6!"


Until it's 7 years later, the original developers are gone, the ticketing system has changed twice, and you have no clue why something is the way it is.

When you're committing is exactly when you already have the context of "why" loaded and even a short explanation should be quick to write. The thought cycles argument feels lazy unless you're doing a bunch of quick exploratory commits and clean up/squash your git history later and add context once a solution solidifies.


In that case, why should the commit history be the place to go? Commit histories are extremely exclusive – everybody not a part of the programming process will be locked out of that information. That is not fair.

Regardless, what you describe is more an organisational failure than an issue with commit messages.


> In that case, why should the commit history be the place to go?

In my experience with open source projects, the history is very much where I go. Say I read some code and don't understand a line (say it is weird, but it does not feel like complete garbage because the rest of the code is actually good), then I will definitely `git blame` or even start digging in the history to see where that line comes from.

Good commit messages have saved me more than once in that situation. It doesn't have to be a whole essay, but something meaningful. Something like "apparently Travis CI wants two white spaces here" is already useful: it says that back then, they used a CI called "Travis" and it required that weird extra space. Now I feel safe removing it because the project does not rely on Travis anymore. (For example).

Note: it could be in a comment. But comments rot, move, get out of sync, disappear. It's much harder to check all the revisions of a file in the last 7 years to look for a potential comment on another line than it is to find the commits that actually edited the line of code.


> If I have to spend thought cycles on the commit message, that is time that goes from other productive endeavours.

This is a bad excuse against writing proper commit messages, since it can be easily extended to user and development documentation. If you want to classify these as productive endeavors while commit messages as non-productive, it basically boils down to doing as little as possible that you can get away with.

> On my work I make 1-15 commits a day.

That is hardly hectic enough to avoid good commit messages. I have seen people writing good commit messages at much higher commit rates. Frankly, good commit messages are actually time savers if you have a high commit rate.

> I particularly think that high-paced application development does not benefit from git as documentation.

Things like good commit messages and a lot of other best practices are completely avoidable in the name of high pace. However, the time savings are marginal compared to the quality you sacrifice.


I’m surprised that you (in particular) would say this. git-log is, to me, fine for displaying the whole message (not just the subject). And sure, I often fiddle with copy-pasting SHA1s like a caveman, but it’s fast enough for some quick history spelunking.

Finding the history of a particular code change is even more manual for me: maybe doing a chain of `git log -S'line'` where `line` is copy-pasted in at every step. But doable and not a time-sink for my off-hand what’s-this thoughts. (But: something more convenient that isn’t an unreadable Unix pipeline one-liner would be very nice.)
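
One step of that chain can be wrapped in a small function so the full messages come out oldest first. A sketch only: `pickaxe_story` is a made-up name, and `-S` matches commits that change the number of occurrences of the string, so a line that was merely reworded still needs the next hop done by hand with the older text.

```shell
# pickaxe_story STRING [PATH...]
# Hypothetical helper: show, oldest first, the full message of every commit
# that changed the number of occurrences of STRING (optionally under PATH).
pickaxe_story() {
  needle=$1
  shift
  # %B prints the complete commit message, subject and body.
  git log --reverse --date=short \
    --format='== %h  %ad  %an%n%n%B' -S"$needle" -- "$@"
}
```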

My litmus test is simple and doesn’t involve hallucinating that other people are even reading my messages: am I reading my own past commit messages? Yes. I am curious why I did or didn’t do something on a daily basis(!)


To tack one additional problem onto your excellent list: the commit message is usually only the start of a conversation about why a change should be made. The rest of that discussion is whether it meets the bar and what needs to be adjusted before it can land on the collaborative trunk. Done well, that is valuable reading.

Git was designed with the distributed viewpoint. A commit message, as written by the author, is necessarily correct: I’ve decided this is right, and it’s on you to decide if you want to merge it into your history too.

In our current systems we usually have a URL in the commit message that links to the actual story behind the commit — the discussion on the pull request, merge request, or code review. I rarely see the results of these discussions being amended into the commit message. If the repo lives forever but the database behind the code review tool gets toasted then something just as important is lost forever.

(I come from a background of one idea equals one amended, fast forwarded commit to master. It’s possible other people rely on branch history to reflect the evolution of ideas and how they go from a request for review to approved code. In my experience branch histories tend to have very low quality commit messages and even then they only show one side of the conversation — the author’s responses to their reviewer’s and their own critiques.)


> I don't know exactly what the answer is, but the sad truth of Git

> is that writing amazing documentation via commit message,

> for most communities, is almost entirely a waste of time.

> It's just too difficult to find them.

I completely agree that well-written git log messages are goldmines of information.

I wish makers of popular git forges had made it easier to create and consume this information.

Almost all my wiki pages start with piping git log messages into a text file.

Git logs are the entry point to good project documentation.
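
As an illustration of that first step, something along these lines would do. The function name and output layout are assumptions made for the sketch, not a description of the commenter's actual setup.

```shell
# log_to_notes OUTFILE [REVISION-RANGE]
# Hypothetical helper: dump commit dates, short hashes, subjects and bodies,
# oldest first, into a text file to use as the seed of a wiki page.
log_to_notes() {
  outfile=$1
  range=${2:-HEAD}
  git log --reverse --date=short \
    --format='%ad  %h  %s%n%n%b' "$range" > "$outfile"
}
```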

(edit: fix formatting)


To be clear from reading some of the other comments, I don't work at GitHub anymore so while I may have partially caused the issues I'm complaining about, I don't have the ability to fix them anymore.

Also, while most GUIs and editors have blame capability (as does GitHub actually), most of them don't ignore whitespace changes (-w), code movement or renames (the -C options) so they're often of limited use.

Finally, I _would_ like people to write good commit messages, I just would like to see a tool that actually uses that work in a way that helps document your code in an easy and valuable way, and the Git/Hub tooling makes that process at best "tedious" as someone in the thread says.

I am working on a new Git client called GitButler[1] and would like to address this at some point down the line, so maybe it ends up being me who helps fix this after all :)

1: https://gitbutler.com


In my experience it all depends on what kind of codebase it is (product? library/framework? private company? opensource?), commit velocity, release cadence & how the codebase is used in general.

In low-velocity opensource libraries, good and clean commit messages can be really helpful when debugging arcane issues. I used to be maintainer of a frontend framework & widget library and we tried to have good commit messages as we'd often go back over old commits when fixing bugs.

I agree that using git from command line for blame is not easy, this is something I always do from GitHub UI instead.

When GitHub is the repo's choice for PRs, and the codebase is product codebase with high velocity, having a pristine git history and clean commits and commit messages is not practical; however, the expectation should be to at least have good PR descriptions. When blaming commits in GH UI, it's easy to go to the PR which introduced the commit (it's linked below commit title); and PR descriptions can be enforced via templates in .github folder.

PR descriptions have an advantage that they can use images, videos etc. to better explain what they change. This is especially useful for frontend codebases.

I work on a big frontend monorepo. We have tools in place to do visual bisect between pull requests (each PR gets its own preview env). We very much do read PR descriptions when doing bisect to confirm which of the recently merged dozens of PRs introduced a regression in production N hours ago.

But in general I agree that commit messages are not a good place to store general knowledge (they're good for "what and why is changing here"). For documenting gotchas etc. I prefer to have code comments in relevant places of code; or README.md in subfolders. (Sadly, I notice most programmers just don't document anything anywhere at all).


One tool that I think promotes commit messages like the OP is magit in Emacs. Before using magit, I always used `git commit -m '...'` and didn't realize that commit messages could be longer than a line.

I agree that this is a tooling problem, but magit is a breath of fresh air in many ways (including verbose commit messages).


What I like about magit is that it shows me the diff of the would-be commit when I write the commit message. And also that I can pick which sections of the diff to a file I want to include in the commit.

I used Vim + git CLI before and this was much less convenient. (I never tried fugitive though. It might be similarly great on these two features.)


While this may be how most people interact with git, I couldn't disagree more when it comes to my personal use.

I use 'git blame' (I've never needed to pass any options to it) and 'git show' liberally if I'm trying to understand a change that was made, and if the committer took the time to write a commit message body, of course I'll see it and read it.

> ... I think why people just don't care much about good commit messages. It's just not easy to get this data back once it's written.

I think people don't care much about good commit messages because they are unprofessional and sloppy. They just want to get the commit in, push the PR/MR, get it reviewed and merged, close that Jira ticket, and get credit for those sweet sweet story points (ugh). And on top of that, they generally don't care to document their changes because they personally don't see the value of doing so. Surely they'll remember the change if they ever revisit it (no of course not, but many people think they will), and they don't really give much thought to the possibility that others might need more context.

And besides, all the discussion about the bug or feature or whatever was happening in the bug tracker, so providing a link to that issue in the commit message is enough, right? (No, it's not; I hate it when people do that and think that's all they need to do.)

> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

Then maybe this is GitHub's fault; fix your web UI, then. I avoid GUI interfaces to my dev tools as much as possible, and I think the git command line is perfectly fine for this. It absolutely does not only show the first line, generally. 'git log', 'git show', etc. give you the full message by default. In general I would say you have to go out of your way (by providing more command line options) to hide the message when using the command line tools.

> the Git commit message is a unique vector for code documentation that is highly sub-optimal.

Sure, because it's not a vector for code documentation, it's a vector for change documentation. And there's no better place to put the description of a change than in the record of the change happening.

While I agree that many people write very poor commit messages, I don't think the tooling and discoverability is why.


> Even if you're _very good_ at Git, finding the correct invocation of "git blame" (is it "-w -C -C -C"? Or just _two_ dash C's?) to even find the right messages that are relevant to the code blocks you care about is not widely known and even if you find them, still only show the first line. Then you need to "git show" the identified commit SHA to get this long form message. There is just no good way to find this information, even if it's well written.

The good way to browse git blame and read commit messages is to use Magit. It is also great at letting you seamlessly rebase/split/merge long patch series.


In practice, I think GitHub/GitLab/etc solve this UX problem pretty well. Inline git tools let you jump immediately to the PR that generated the code change, and the PR description + code reviews + snapshot of the commit help to understand what the point of the change was. You can search the PRs when you want to find some context. (It's unfortunate that PRs are not stored in the repository itself. I mean, Git is not a great database for a multi-user webpage, so this wouldn't quite work... but it would be nice if the archive was durable and easy to export/share.)


> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line. So in the case of this commit example would be the very simple message of a generic "US-ASCII error" problem.

This is a feature, and a crucial one. No one would include fifty lines of explanation if everyone had to see it. It would be better to throw the information away than to inflict it on everyone who was scanning through the commit history looking for a particular change.

Yet it is valuable information that only makes sense in the context of that change. There is nothing in the corrected version you can connect to the issue that was fixed. It's obnoxious to include comments about errors that have been removed, like this:

    # where civica QueryPayments calls are taking too long # use ASCII whitespace

(This is ridiculous, but not unrealistic. I've seen code comments that said things like "# removed syntax error in invocation of query generator." This is what you get from programmers trying to juice their LOC stats.)

The commit message is the right place for this kind of information, but most people reading the commit messages don't care. They're scanning through looking for something else, and all they need is a few words that tell them if this is the commit they're looking for. The person who needs to see the full story is the person who is interested in this change in particular. Maybe they found it by grepping the git log for "invalid byte sequence". Maybe they found it because they're looking at all the changes in that file, because some tooling that occasionally modifies that file keeps messing it up. What matters is that if they have a special interest in that change, they have a way to see whatever information the committer felt was worth preserving, and the committer has a place to put that information where only someone with a special interest will see it.
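To make the two audiences concrete, here is a rough Python sketch of the split described above: a one-line view for people scanning, and a full-text search for the person with a special interest. The function names and the `(sha, message)` tuple shape are assumptions for illustration, not anything Git provides.

```python
def subject(message):
    """First line of a commit message: what scanning tools typically show."""
    lines = message.strip().splitlines()
    return lines[0] if lines else ""


def body(message):
    """Everything after the first blank line: the long-form explanation."""
    parts = message.strip().split("\n\n", 1)
    return parts[1].strip() if len(parts) > 1 else ""


def oneline_view(commits):
    """Abbreviated SHA plus subject, for someone scanning the history."""
    return [f"{sha[:7]} {subject(msg)}" for sha, msg in commits]


def grep_messages(commits, needle):
    """Case-insensitive search over full messages, subject and body alike."""
    needle = needle.lower()
    return [sha for sha, msg in commits if needle in msg.lower()]
```

The scanning reader only ever pays for `oneline_view`; the body costs them nothing until `grep_messages` (or a blame) leads someone to that specific commit.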


I feel half vindicated about my rant a few weeks ago[1] arguing that we should make commit messages as long as we like instead of the stupid 50 character or whatever limit. If enough people do that, maybe tools like GH will stop wrapping the message by default. Even if not, at least the first line is usually easy to see in most tools by hovering over it or something.

[1] https://news.ycombinator.com/item?id=38831282


Presumably you're referring to commit subjects.

And no, they should absolutely not be as long as you like. It breaks things.


Making sense of code (or any system that changes over time) vis a vis its own history is one of those things where I really think AI/ML tools can really shine. Even with relatively low quality commit messages, I can look at something that happened 15 years ago in a codebase I am familiar with, and there will probably be enough information that I can assemble the full context, even if finding some of that information is challenging or time consuming. git log, git blame, look at the other code made in the commits, read the issue descriptions, read the code reviews. It just seems like a model could slurp that up and do a decent job of giving you a couple of paragraphs about why the line of code you are staring at is the way it is.

TBH putting such a detailed writeup in the git log doesn't really have any return -- for it to ever be useful to you again, you have to know the information is there; you then have to actively seek it out, with the hope that whatever you did to make it 'searchable' is going to work for you again. I can say with surety that if I were looking at a bug similar to the one linked from this article, I would not look to the git log for inspiring a fix; I'd just fix it. Any extra time I would take would be to understand how a UTF8 nbsp ended up where it shouldn't have been in the first place -- something that the author of this commit seemed to have no interest in doing, but which likely has greater relevance than the documentation of the fix.

I want to be clear that I support commit messages that say what they do though; I'm not advocating for -m 'fixed' shenanigans, however at the same time I believe that -m 'fixes #1234' is often enough


Of course Emacs has a mode for it:

https://github.com/redguardtoo/vc-msg


I take Scott's point with a different perspective.

Though commit messages are ephemeral and hard to utilize in the future, they're the stream of consciousness of the project.

They convey very important shifts in direction, discoveries in the making, code smells, limits of current architecture, and markers of tech debt. We don't know what this beast will be. And we figure it out commit by commit. Document it.


Commit messages are the very opposite of ephemeral; they are the longest-lasting history a project is likely to have!


Yes, I misworded. The usefulness of commit messages, Scott's point.


Completely agree, the value with the message is really just to link an external ticket Id, the user experience is much better in external ticketing systems for all of the story telling that the article loves.

Don't read "external ticket system" as closed either, plenty systems are open to the public.


Right. The massive commit with minimal description and a PR number which I can look up in Azure DevOps to find a review with no description, no discussion and a mention of a number I can go and look up in Jira, where some Scrum master wrote half a sentence of what needs to be done and asking to "reach out to Jeff" for explanation.

So much more valuable and great user experience


Just wait till next year when your employer migrates away from Azure DevOps and that PR number will be a dangling link forever lost.


Or you just move the repository to a different project


git log. git blame, grab hash, git log hash. You make it sound like some arcane magic...
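Sketched in Python, the blame-then-look-up sequence could look like the following. This is a sketch under assumptions: the function names are invented, it uses `git show -s --format=%B` for the second step rather than `git log`, and it assumes `git` is on the PATH and the line has been committed.

```python
import subprocess


def sha_from_porcelain(output):
    """Extract the commit SHA from `git blame --porcelain` output.

    The first header line has the form:
    <sha> <original-line> <final-line> <number-of-lines>
    """
    first = output.splitlines()[0]
    return first.split()[0]


def message_for_line(path, line, repo="."):
    """Return (sha, full commit message) for one line of a file."""
    blame = subprocess.run(
        ["git", "blame", "--porcelain", "-L", f"{line},{line}", "--", path],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    sha = sha_from_porcelain(blame)
    # -s suppresses the diff; %B prints the raw subject and body
    show = subprocess.run(
        ["git", "show", "-s", "--format=%B", sha],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return sha, show.strip()
```

Two commands and one string split; whether that counts as easy is what the rest of the thread is arguing about.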


It's amazing, your experience with Git is so different than my own.

I routinely open a file in my editor, hit "Ctrl-c v B" for Git Blame mode, go to the line I'm interested in, and hit "Enter". Bam, there's the full commit message. From there I can continue to trace backwards, blaming lines and reading full commit messages.

But, you know, not everyone uses Emacs and Magit, fair. How about just using "git gui blame file"? Click on a blame line, see the full commit message. This is a tool included with Git (available in a separate package in some installations).

OK, rather use an IDE? Install GitLens in VSCode. Easily accessible blame in your editor, where you can hover or click in various places to see full commit messages.

I mean, I agree in part; there are some tools which make good commit messages hard to write or find. The tiny little commit message edit box in VScode is not ideal. Lots of people use a workflow of "commit lots of crappy commits with one liner commit messages, let GitHub/GitLab squash them on merge."

But as an expert Git user who has managed to convince some teams to have a good commit message culture, if you do get people used to writing good commit messages, they can be very easy to find and read later on, there are tons of tools that make them easy to browse.


I regularly use the git command line, and "git show (pasted SHA)" in my second terminal doesn't really feel like the road block to understanding the grandparent seems to make it out to be. It takes me many orders of magnitude more time understanding what is output rather than searching for it, and like you mentioned there are any number of UIs (third party, editor integration, or even shipped with git like gitk) that wire everything up into a nice UI.

And I also disagree with the GP's complaint that "Most people only read the shortlog" being any kind of disadvantage. The commit message isn't for everyone, it's for the one time someone needs to figure out exactly what it did and why that commit was made, and why a change in X causes a behavior change in Y, and can save hours of work. It's like code comments, 99 times out of 100 you don't need them as you're just interacting with a documented API, but that 1 other time they are a godsend.


I use fugitive.vim, and blaming is very convenient there as well as every other git workflow. I can press a shortcut to see when every line in the current file was changed, and who changed it along with the commit hash. If I need more – I can expand every hash to see the full context, including full commit text and diff. Maybe cli git is not too easy to use given how complex it is, but there exists a git wrapper so awesome it should be illegal.


> The main issue is that most of the tooling (in Git or GitHub or whatever) generally only shows the first line.

This has been an issue with version control tooling for quite a long time. I'm fairly certain both CVS and SVN did the same thing. But I agree that you're still right.

I'm also very amused by the number of replies to your comment along the lines of, "oh, it's actually very easy because I always use <third party tool>".

Which is, of course, rather proving the point.


No, because the point is about common tooling, and the common tooling does not, actually, make this difficult.


That's rather a matter of opinion. One that I and GP both clearly disagree with. Insisting it's easy doesn't mean everyone will agree with you.


Author of nit, here. I tried to move the landscape towards semantic reasoning. It’s on github but kind of abandonware. Life and incompetency happened ;)

No shilling. I commented here because I still think my framework was decently thought out, and mostly that calling someone a nit or a git is exactly what linus was thinking. Make it easy enough for anyone to use.

Nit is something people could take as a thought experiment.


So then I am not wrong that I do all my git commit messages via the "-m" commandline option with a short phrase like "frob the baz"?

(Initially I started using -m to avoid getting trapped in Vim. But even after I gained the option to use e.g. Notepad++ as the editor, I never saw the point in using anything more than "-m 'message'".)


Git respects the EDITOR environment variable and has done for decades (so likely before many here really used it) - you should probably be setting that (or equivalent on your platform) to the editor you want anyway.
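As a simplified model of how Git picks the editor, here is a small Python sketch. The lookup order follows Git's documentation (`GIT_EDITOR`, then the `core.editor` config value, then `VISUAL`, then `EDITOR`, then a built-in default that is usually `vi`); the function name and parameters are hypothetical, and real Git has extra conditions (for example around dumb terminals) that this ignores.

```python
def resolve_editor(env, core_editor=None, default="vi"):
    """Return the editor Git would likely launch (simplified model).

    env         -- mapping of environment variables, e.g. os.environ
    core_editor -- value of the core.editor config setting, if any
    default     -- compile-time fallback, usually vi
    """
    candidates = (
        env.get("GIT_EDITOR"),
        core_editor,
        env.get("VISUAL"),
        env.get("EDITOR"),
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return default
```

So with nothing configured you land in `vi`, which is presumably how a lot of people ended up reaching for `-m` in the first place.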

Weird workaround just to avoid basic configuration seems like more work in the long run.


I agree for the use case of scrolling through a git history, yes, but when I land at a certain commit, e.g. by hitting the blame label in IntelliJ on a line whose reason d'change I'm interested in, then I will totally read the whole commit message in the hope that it helps me understand the change (in addition to looking at and trying to understand the change itself).


> Everything they talk about in this article is what is great about the _rest_ of the commit message, which, given modern tools, is _almost never_ seen by anyone.

This was why I created gh-ph [1].

[1] https://github.com/Frederick888/gh-ph


If you follow a pull-request based workflow, and if you typically squash down to one commit, then finding these messages isn't too bad, since the commit description pre-populates into the pull request description. I often track changes down not to their commit, but to their pull request.

Granted, that's not exactly `git`, but rather `github`…


I feel like this explains a lot about why GitHub is so consistently hostile towards showing or writing decent commit messages.

Which has helped push people away from writing useful ones, on an unprecedented scale, which makes it a self-fulfilling prophecy.

Great.

Just great.


This sounds like an excellent sales pitch to use good email-based workflows such as those advocated for by Drew DeVault[1].

1: https://git-send-email.io/


Just do a threaded conversation in a comment at the top of each file. Add your name and the date.


The reason people (myself included) rather like good Git commit messages is evident when one compares them to the alternative.

You're working in a commercial/closed source environment and want to find out why line 57 in src/blah/db/utils.py does that. Where do you look?

- inline code comments. Usually non-existent. Often out-of-date, sometimes misleading, frequently tells you no more than you can discern from just reading the code itself (especially now type annotations are trendy again). Rarely explains why the code exists. There's a reason people caution against too many comments, and that translates into people probably not putting enough comments in.

- calling code? Helpful, but thanks to microservices and increased levels of abstraction (APIs, DI frameworks, messaging buses, config parsing) you've got to go check 900 different repos out to work out what is going on.

- email? Give up. You'll find invitations to the company Christmas party and Q2 sales figures but actual tech explanations are in short supply.

- Slack etc - same problems as email, plus developers who hide away all the interesting stuff in private team channels

- Google Docs - you probably don't have access to the relevant doc, and there's no way to know that you don't

- wiki/docs? Half baked, wrong etc. Or it'll be autogenerated JavaDoc type stuff that'll tell you what you already know or can reasonably infer from the code. Also, findability sucks. Or the developers just avoid the whole thing because the software is nasty and corporate and barely usable.

- bug tracker/ticketing system? You ask around and someone says "oh yeah, Dave made that change two years ago" and then you search for tickets that match related keywords only to find out that those tickets weren't brought over from Trello into JIRA, and now you need to go ask IT to give you access to the legacy Trello board which they don't want to do because then it'll put them over the five users per month limit or whatever.

- Architecture Decision Records / decision logs / whatever you want to call them - nice if they exist, I guess.

- ask the person who wrote it? This assumes they still work there and can remember. Plus you gotta do the asking around routine which takes days and destroys all hope and joy in the world.

By a process of elimination, commit messages are the closest you're going to get. They're right there - on your computer, neatly integrated into your editor, hopefully. You can search them fast in a terminal window rather than in some slow web-based monstrosity. If you're lucky, they're actually useful. Even if they aren't, they're at least contextually useful in helping you narrow down your search strategy for the inevitable plunge through email/slack/JIRA/Trello/internal wiki etc.

Ideally what should happen is the really useful commit messages get copied into stable technical documentation like decision logs or a properly maintained wiki. If people did that, great, but it's pretty rare. A culture of sharing weird interesting tech things in a Slack-type system can help because future devs can at least search but you do that at the cost of more interruptions for colleagues now.

The broader issue is of all the bad options you can choose, it often tracks the wrong thing. In something like Trello/JIRA/whatever, if you're looking for the technical reasons, it'll have the business reasons without the technical stuff, or vice versa. You generally want both, and most systems only give you half the story.


I know the OP didn't mean it this way, but after reading HackerNews for the last decade or whatnot, it never ceases to surprise me how often developer complaints stem from developers just not doing their damn job.

"Almost nobody ever sees it.... nobody reads anything other than the first 50 chars of the headline."

On the one hand, I get it. If a tool makes something difficult, people are less likely to do it, and as engineers we want to make tools to cause people to fall into the pit of success. So, improving this part of git makes sense.

On the other hand, just do your damn job. If a coworker doesn't understand a code change, because they didn't bother to read the commit message, they're a bad developer. If they didn't write a git commit message because "no one is going to read it anyway", they're a lazy engineer. These things aren't excuses, they're incompetence, and not everything needs to cater to the least competent people in our profession.


When I document, or write commit messages, I don't really _care_ if other folks will ever look at them. Documentation is a gift for future me. If something wasn't obvious to figure out, or a potential source of future problems, I want it written down, so if _I_ go looking for info, it's there.

The fact that things are now documented for other folks is just a side benefit.


This.

I write documentation for me.

Very few folks ever use my published code, which is fine by me. I publish it, because treating my packages as atomic, ship-ready, high-Quality products, forces me to take great care, in each and every one.

Which means, when I use them in my other work, I don't have to worry about them.

My take on documentation is thus: https://littlegreenviper.com/miscellany/leaving-a-legacy/


Documentation is for everyone. Put it where everyone can read it, not hidden in the most esoteric place possible, via a very unfriendly tool.

I do that often and call it the developer guide. After the user and ops guides.

Not to mention comment and doc strings about why in the current code.


This entire thread is making me feel like I'm taking a crazy pill.

A simple git log is considered "esoteric" these days? No extra command line arguments are required to read the entire commit message. If so, software "engineering" is truly a dead discipline. I guess the "move fast and break things" crowd have taken over.


Never worked with other people? Git as a thing is off the table.

Which commit? One month ago or three? Pro{gram,ject,duct}M, SME, QA, or user wants to know why? Do they even have a login to the systems they’d need? Is there search so they could find it themselves when you quit?

Or you could send them a link to the docs.


Not sure why you're attacking me. I don't remember saying anything offensive.

Anyway, I've never been a fan of "moving fast and breaking."

Might want to give that blog entry I linked, a read.


I feel like it's not a question of "doing your damn job". It's a question of what value can you expect to get from a particular investment. If blame is your tool and every line happens to be changed from a different blame invocation (is it "-w", "-w -C", "-C -C -C", etc), how do you learn the story of this block of code best? Maybe you then need to read a story _per line_ of code. But that's not actually worst case. Maybe you need to drill down to the commit _before_ that because the last change isn't semantically important. Maybe the one before that, etc. How many commits that touch those lines significantly do you need to research and read amazingly well written commit messages before you totally understand the context of this particular block of code?


Coding is a really interesting field in how quickly it's developed, and I think there's a lot of people who assume their environment is the only environment and it should be that way for everyone.

Spending time digging through commit messages when tooling and design makes it harder, not easier, is a risky proposition if it turns out it was all fucking useless and you didn't find anything worth reading and are now even farther behind. Either due to the quality of the messages or a lack of your ability on your end to find what you need.

I'd love to work in the sort of environments these people seem to but it's just not been the case. I'm at a smaller company where I get to wear many hats, and coding/development is just one of them, but I know plenty of people at very large companies who also don't really do things like that because "putting in the effort" isn't rewarded as much as whatever arbitrary metric they're graded on.


Git visualization isn’t git’s job, providing primitives like blame for git visualizers is. If you use VSCode with gitlens (pycharm is similar), the exercise you just mentioned is trivial. Focus a line, get the correct blame. Click on that, view the commit history. Click from there, see the diff.


> it never ceases to surprise me how often developer complaints stem from developers just not doing their damn job.

One thing I learned is that any forum that appeals to software engineers will appeal to software engineers of all skill levels, from the guy that did a 6 week coding camp because he heard SWEs make a lot of money but didn't really learn anything but thinks he's an expert now, to geniuses with 10+ years experience.

For every comment from someone who really knows what they're doing, there's one from someone that really doesn't.


Yep. This is one of the big problems with online communities. When someone makes a bold statement, I have no idea if they’re a grizzled engineer with grizzled, hard earned engineering opinions, or some kid fresh out of a coding bootcamp who thinks they’re all that. In person, I’d treat those two people incredibly differently. Online? It’s impossible to spot the difference.

It doesn’t help that we all think of ourselves as programmers, even though people in our industry have a wide range of jobs. Someone working at a feature factory banging out websites and mobile apps has a very different job from someone slowly puzzling out a new cryptography algorithm or debugging a kernel driver. You can tell they’re different jobs because excelling in those roles takes different skills. In the first case, you want to know your domain backwards, have great social skills and work consistently. In the latter cases, you need deep CS knowledge, patience and insight. It’s different.

Who is this site for? What does everyone do for their job? It’s all quite unclear.


> Yep. This is one of the big problems with online communities. When someone makes a bold statement, I have no idea if they’re a grizzled engineer with grizzled, hard earned engineering opinions, or some kid fresh out of a coding bootcamp who thinks they’re all that. In person, I’d treat those two people incredibly differently. Online? It’s impossible to spot the difference.

I know there are limits to this, but isn't that a good thing, too? It's all too easy to treat a newbie as if they can't possibly have useful things to offer to the community; being forced to treat all comments equally, without being able to fall back on the crutch of reputation, arguably forces one to read and engage more deeply with the content, and offers the chance to surface the occasional genuinely valuable contribution from a newbie (or, for that matter, to avoid letting someone get away with an ill considered or overbroad statement just because they have such a big reputation that people are too afraid to stand up to them).


True, but occasionally a newbie offers a solution to a problem and rejects the rejection when people try to tell them why their solution won't work.

For example, several years ago here on HN, during a thread about cryptography, someone admitted to not knowing much about cryptography, but offered one-time pads as a solution to the weaknesses of PKI. I tried to tell them that OTP solve a different problem that is unrelated to PKI, but they wanted nothing of it. They claimed that my response was purely emotional and that just because I knew more than them about cryptography doesn't automatically make them wrong. When I tried the Socratic Method to lead them towards an understanding of why they were wrong, they accused me of being condescending and said that I should answer my own questions.

If I see a bold claim that I have a hard time believing, then I'll ask follow-up questions. But when someone makes a bold claim that is just factually incorrect, while admitting they don't know much, and then gets upset when someone tells them why it's incorrect, then that's just plain frustrating.


> If I see a bold claim that I have a hard time believing, then I'll ask follow-up questions. But when someone makes a bold claim that is just factually incorrect, while admitting they don't know much, and then get upset when someone tells them why it's incorrect, then that's just plain frustrating.

But that doesn't seem a situation that would have been addressed by fixing the problem that you originally identified:

> Yep. This is one of the big problems with online communities. When someone makes a bold statement, I have no idea if they’re a grizzled engineer with grizzled, hard earned engineering opinions, or some kid fresh out of a coding bootcamp who thinks they’re all that. In person, I’d treat those two people incredibly differently. Online? It’s impossible to spot the difference.

Here, it sounds like you knew on which side of the divide a person fell. The resulting problem is still a problem, but it's one that also occurs offline!


Once I started interviewing - mind you, interviewing candidates that already got through several filters before getting in front of me - I realized how mediocre the average engineer is. Should it then follow that the average engineering opinion online is mediocre?


pretty much yes, even here you see a lot of overconfidently mediocre takes


It’s important to know how to stick to your course in the face of bad advice.


Any system where the proposed solution is "be better" without an outline of "and here is how" and some method of enforcement is doomed to fail.

Checklists, build checks, linters, tests, SLOs, post incident responses, follow up tickets, etc all serve to unload "be a better software developer" into actual systems and processes that can continuously enable the better behavior.

Simply stating "do a better job" won't work as organizations scale. Related, you can expect what you inspect.
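As one possible example of turning "write better commit messages" into a check, here is a minimal Python sketch of a commit message linter. Everything here is an assumption for illustration: the function name, the three rules, and the 50-character default (taken from the convention mentioned elsewhere in the thread). It also does not strip `#` comment lines the way Git does.

```python
def lint_commit_message(message, subject_limit=50):
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["empty subject line"]

    problems = []
    if len(lines[0]) > subject_limit:
        problems.append(f"subject longer than {subject_limit} characters")
    # Convention: a blank line separates the subject from the body
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Require at least one non-empty body line after the separator
    if not any(line.strip() for line in lines[2:]):
        problems.append("missing body explaining why the change was made")
    return problems
```

Something like this could be wired into a `commit-msg` hook, which Git runs with the path to the message file as its first argument, or into CI. Whether a required body actually produces useful bodies is a separate question, but at least it is something you can inspect.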


There were two hands in that comment (on the one hand/the other). One hand said that Git and other tools should be better. Only the other hand said to be better.


> If they didn't write a git commit message because "no one is going to read it anyway", they're a lazy engineer.

If an engineer spends an hour writing a commit message that no one reads, that's an unproductive engineer, compared to where they should be.

I have to admit, I am lazy. I don't spread seeds by hand; I use a tractor. I don't swim across the ocean; I use an airplane. Likewise, I don't write documentation in commit messages; I write documentation in PR descriptions, READMEs, and official document sources. You got me, I'm incompetent.

My "job" is to write software, not follow some arbitrary "pure" practices.

> If a coworker doesn't understand a code change, because they didn't bother to read the commit message

And I would argue we shouldn't cater to developers who make documentation difficult to access for everyone else by hiding it where only crappy tools can reach it.


> If an engineer spends an hour writing a commit message that no one reads, that's an unproductive engineer, compared to where they should be.

Okay, maybe don't spend an hour. It would take a special kind of commit to need more than a few minutes writing a decent commit message.

> And I would argue we shouldn't cater to developers who make documentation difficult to access for everyone else by hiding it where only crappy tools can reach it.

Yeah. Like web browsers. And PDF viewers.

The non-caustic point here is that clearly different people have different ideas about what is accessible.


If writing good commit messages isn't specifically defined as part of your job, why would you waste business hours writing commit messages that are beyond what is expected of you and frankly useless since nobody would ever read it anyway?


In my opinion the PR/changeset description is exactly what should be in the commit description. In cases where we had 1 commit per PR (i.e. squashing before merging), just copying the PR description into the merge commit worked really well: the goal for a PR description and a commit message is essentially the same.

I wish GitHub allowed making the copying automatic and ensuring that it happens (it doesn't, unfortunately).

If someone wants to learn the full history - the remarks during code review, perhaps all the WIP commits - they can read the PR/code review comments. I found it to be very rarely needed.


This is exactly how Microsoft Azure DevOps works when you enable the squash-on-merge behavior (which is how we used it while working at Microsoft). I thought this was completely logical and I'm surprised that GitHub can't be configured the same way.

All of our commit messages were nice, long and detailed, with a link back to the PR if you really wanted to go back and see the individual commits and/or discussion that occurred on that PR. I think I only looked at individual commits maybe once or twice since they were usually useless in isolation (woops, WIP, fix typo, etc.).


settings > allow squash merging > default commit message > pull request title and description


Oh, nice, thanks. I haven't been using github recently, it looks like this option was added ~1.5y ago.


Writing good commit messages is part of your job in the sense that no reviewer should be approving anything without them, knowing that you may or may not be available if it breaks next year at 3 AM.


I put that kind of thing in the pull request, since that's where the review happens. Every commit links back to a pull request, and people actually do refer back to it. Writing good documentation there is part of the job.

No-one's going to ALSO write commit messages that no-one will see.


The git history will last longer than the platform hosting the PR.


Not only is “git log” always up, I can also skim it without opening a hundred browser tabs.


pull requests don't live in the repository (as far as I can tell...) and require you to use whatever online interface creates them.

Not sure why e.g. GitHub pull request merges don't include the entire pull request description in the merge commit message.


In my company we norm the titles of MRs, give it a ticket number and squash it all. If you can’t concisely describe what you did you need to split your MR.

Helps with reviewing as well as blaming.


I'd do it for CYA purposes in case something goes wrong and my commit is involved.


Why would I waste time reading this paragraphs long commit message when I can look at a diff and a 40 character headline and completely understand the issue? You think it's lazy, I think this is wasteful. Personally I don't need an epic story about making a one character change because your editor isn't configured to catch gremlins... it's just not that interesting.


Because you don’t actually understand the subtleties of the side effects of that 40-character change and building intuition about it takes a paragraph.

It’s all fun and games until your codebase is >1,000,000 loc.


This code never worked but made it into production. What I see is a developer hucking garbage over the wall, not testing their own code, passing reviews assuming they exist, and eventually stopping the train in its tracks because they're more concerned with pretty commit messages. I also think this is beyond simple to catch early be it the editor, the pre-commit hook, or any other range of tools that could have and should have prevented this.
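For what it's worth, a check like that is only a few lines. Here is a minimal sketch in Python, assuming the hook is handed a list of file paths. The names `find_non_ascii` and `check_files` are made up for illustration and are not from the article's repo:

```python
from pathlib import Path


def find_non_ascii(data: bytes) -> list[tuple[int, int, int]]:
    """Return (line, column, byte value) for every byte outside US-ASCII.

    Line and column are 1-based; the column counts bytes, not characters.
    """
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


def check_files(paths: list[str]) -> int:
    """Print offending bytes and return a non-zero status if any were found."""
    status = 0
    for path in paths:
        for line_no, col, byte in find_non_ascii(Path(path).read_bytes()):
            print(f"{path}:{line_no}:{col}: non-ASCII byte 0x{byte:02x}")
            status = 1
    return status
```

Wired into a pre-commit hook with the staged file names, a non-zero return from `check_files` would block the commit. A UTF-8 non-breaking space shows up as the byte pair 0xc2 0xa0.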

I'm not saying a detailed commit message is never warranted, I'm saying this fuck up doesn't warrant a short story let alone a prize for being overly verbose. BTW, I did a loc . on the repo I work in, came back with 7400000 lines of code. Does this mean I'm cool enough to be in your club?


> Does this mean I'm cool enough to be in your club?

If you have to ask, then the answer is no. I don’t make the rules.


This is the problem with any kind of documentation; while you can write the highest quality, meticulous, most obvious and clearest prose, it's moot if nobody reads it.

And nobody reads it because there's so much of it and there's no clear starting point. People just want the summary of what they're looking for.

I started to learn Java almost 20 years ago, we had a text book and everything. After the first two chapters, I learned how to google and instead of reading everything, just find what I need. I never went in-depth with reading because... it's mostly useless knowledge that quickly becomes outdated.


With commit messages, there is a very clear starting point: the commit message for the commit that last touched the line of code you're looking at with git blame, which is my standard solution for finding out the reasoning behind any piece of code I don't quite understand. Only works for projects that don't destroy their history with squashes or otherwise write uninsightful commit messages (e.g. "fix bug").
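As a rough sketch of that lookup in Python: `git blame --porcelain` prints a header line with the commit SHA before each source line, and `git show -s --format=%B <sha>` prints the full message rather than just the subject. The helper names below are invented for illustration, and the parser assumes the standard porcelain layout with 40-character SHA-1 hashes:

```python
import re
import subprocess

# A porcelain header line: SHA, original line, final line, optional group size.
_HEADER = re.compile(r"^([0-9a-f]{40}) (\d+) (\d+)(?: (\d+))?$")


def shas_by_line(porcelain: str) -> dict[int, str]:
    """Map each final line number to the SHA of the commit that last touched it."""
    result = {}
    for line in porcelain.splitlines():
        if line.startswith("\t"):
            continue  # source text is TAB-prefixed and is never a header
        match = _HEADER.match(line)
        if match:
            result[int(match.group(3))] = match.group(1)
    return result


def full_message(path: str, line_no: int) -> str:
    """Return the whole commit message for the commit that last touched one line."""
    blame = subprocess.run(
        ["git", "blame", "--porcelain", "-L", f"{line_no},{line_no}", path],
        check=True, capture_output=True, text=True,
    ).stdout
    sha = shas_by_line(blame)[line_no]
    return subprocess.run(
        ["git", "show", "-s", "--format=%B", sha],
        check=True, capture_output=True, text=True,
    ).stdout
```

`full_message` has to be run from inside the repository's working tree.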


Nah, we have documentation with a clear starting point - there is an index page with the most common topics, a "new user" page with links to what a new user should read, and some error messages actually contain wiki links to pages with instructions.

And yet we still have people who don't read documentation.


Counterpoint: engineering systems that require constantly overriding basic human nature, and therefore significant regular effort to avoid mistakes, is bad engineering.


I'd hate to say that laziness makes a person an incompetent developer. Often my problems stem from an excess of sincere hard work rather than from laziness.


But if you don’t cater to the worst on your team you are often viewed as the problem


The "just do your damn job" retort presupposes that their job is to read the entire body of every git commit. That's the question - you can't just presuppose it.


Overall agree with the sentiment, but I would add a more specific Bottom Line Up Front (BLUF) such as: "Fix test issues caused by non-breaking space character \xa0".

Tells me exactly what the problem was straight away, but I'm still free to choose to read more if I want to know more.


Yep this message is way better. And honestly, looking at the diff in github it is pretty obvious to me what has changed (and why really, since the only reason for a changeset to have a diff look identical is that non-visible characters have been added or removed).

So all I'd require is a good main message for history-search purposes. A short story about how you went to Narnia and came back to find the root cause of a bug isn't really relevant imo but I'm also not against writing it if you just want to vent in a PR description/commit's extended message.


Stories about how you went to Narnia and back may be useful to a future contributor who finds themselves in Narnia. This is very likely not the last time that an invalid byte sequence will show up in one of the source files in this tree, and if it happens again, it may be good to see the symptoms in the git log.


This would be my ideal commit message as well. The rest of the commit body in the article is just how it was discovered. I don't think describing how one works belongs in a git commit message. Your message tells me why and which change was made, and that's enough for me.


I love this concept. I always begin messages with the most actionable or important thing at the top, and the rest that follows is the context. Respect the time of others and don't bury the lede.


You see this all the time in business proposals. Executive summary at the top, typically 2-3 paragraphs max. Manager summary, 1-2 pages. Engineer detailed overview, 3-10 pages. Anything else is an addendum.


I have felt that pride in writing a great commit message, but I am less sure of the value to others. I don’t think most people search commit messages when they encounter an unusual error message, or when adding a new feature, or really almost ever.

It’s a bit sad, but I have a growing suspicion that beautiful commit messages are a bit of vanity by the programmer. The person primarily impressed is often the author; others will walk on by without noticing.

There is room sometimes for those aesthetic flourishes but I am not convinced they have much practical value, and I have stopped really being bothered by commit messages of “fix whitespace issue” from others. I think I am a better colleague for that.

Things might be different on a project like Git or Linux with huge distributed teams and tons of commits, versus the projects I am used to which have between 1 and 100 contributors, mostly from the same organization.


> I have felt that pride in writing a great commit message, but I am less sure of the value to others. I don’t think most people search commit messages when they encounter an unusual error message, or when adding a new feature, or really almost ever.

They have value even if the only person who will ever look at them is you - and I will say that when bisecting an issue, the commit message of the commit I finally find is really useful (or it could be, if it weren't just "fixed thing"). It also means that if you encounter a similar issue again, you know that there's a note on a commit you can find.


I agree with this wholeheartedly. If writing a detailed, multiparagraph commit message, assume the target audience is future you.

Most likely, a time-pressed dev on the far side of the world will think your commit broke something and send a 2:00 AM message of "URGENT: code broke CRITICAL customer request" with a link to the commit, whatever JIRA issue they are working on, and zero additional context. They will NOT bother to read the message (likely explaining how they got into their pickle in the first place) but will see your email, send a message, and do whatever it is they do while waiting for someone else to figure out the problem. You, being that someone else, will now have an excellent starting point on the top priority for the day. Much better than if your message had just been "fixed it".


In some orgs, people never run a bisect. Not once a year.

They go as far as squashing out swaths of history into big un-reviewable blobs. Once code has been merged, they never look inside a past commit again.

In spite of isolated (desperate) demands for rigor, it works fine.


I despise squashes. They encourage people to treat git commit as a glorified ^S of their work.

You want to know why a change was made, or who made it so that they can explain it. You land on a blob of a diff, with no meaningful commit message (any commit message was squashed to /dev/null to be replaced with the MR title and description). And then off you go to the corresponding github/gitlab/whatever MR only to find a wall of "hmmmm" "why no work?" "try something" etc commits.


> I despise squashs. It encourages people to tread git commit as a glorified ^S of their work.

I can’t imagine working with git any other way. Do you hold off committing because you haven’t collected your thoughts enough to craft a good message? What if your editor crashes and you lose your undo history? How do you get back to the last state where the code compiled? If some plan of attack doesn’t work, you just… reset hard and lose it forever?

I can’t count the number of times I’ve done something like the following:

- Try approach A

- Approach A sucks, commit what I’ve done so far, try approach B

- B sucks too, commit again and switch back to A (oh hey, there it is in my reflog!)

- Turns out I need a combination of A and B. Oh hey, a simple git diff shows me the deltas. Awesome!

- Repeat

- Once I’m ready to make the PR, I squash it all, craft a thoughtful, meaningful commit message, look at the commit as if I’m a reviewer, verify it all makes sense in context as a single commit.

(oh and by the end, the commit may be a SINGLE CHARACTER! Precisely what the author of this article is talking about! Is it your contention that every dead-end the author hit should have a permanent place in the repo’s commit history, forever?)

In your world, do you just… not use git at all here? Do you never try an approach until you’re sure it’s the right one? You only commit when you have something you want someone else to see? That’s nuts to me. You’re absolutely missing out on some of the best workflow git has to offer.


Squashes and rebases, used properly, are done prior to committing work into a shared major branch (like trunk or develop or whatever). The goal of the squash is to make the resulting commit atomic.


This is a false dichotomy. Someone creating the "fix problem" commits is not going to suddenly write great commit messages because the merge strategy changed.

The root evil is actually MRs that live longer than a day or two, and change too much code at once.


Sometimes units of work that a branch would reflect are larger. I agree there are some branches that grow larger than they should, but often there are branches that involve a good number of changes, and breaking it into smaller branches doesn't really make sense either. There will always be branches where there should be multiple meaningful commits, and automatically squashing them all together just defeats the purpose of good commit messages. I don't buy into the idea that branches should always have just one commit or be reflected as a single commit on merge.


Most of my commits are indeed a glorified ^s. Doing this does require some discipline though. You’ve got to pay attention to when you’ve reached a point that the accumulated changes represent a reasonably small, but complete unit of work that should be squashed and documented with a good commit message.


I might be weird but I try to at least skim all of the commits on any project I am actively involved with. If it's an open source project then those commit messages will live on forever. They will even be indexed in regular search engines, not just code search (this maybe not so much now that GitHub is locking out bots more and more)

When I'm trying to solve a problem and not finding results on google or stack overflow, sometimes I search GitHub just to see if a similar thing shows up in PRs or commit messages anywhere (including private repos I have access to search). It's helped me out on countless occasions. Good commit messages do have value beyond vanity, absolutely without a doubt. The fact that many developers aren't looking, that's their loss and hopefully they will see the light once they have enough experience. Maybe teach a junior dev how to search them! Maybe link them to TFA.


I've been bitten by too-short commit messages a few times already: someone asks me "why is it done this way?" and I check the git history to find my own 3-year-old commit with the message "it should be done this way"... Since then I try to write my commit messages so at least future me would get a hint why a change was necessary.


If you change a line of code without doing git-blame on it first you're doing it wrong.

I've been bitten by this many times - I change an obvious bug, I'm about to commit the changes, I see the previous commit which introduced the "bug" on purpose and the attached JIRA task has a perfectly good explanation for why my obvious change would have reintroduced some bug from 2 years ago :)


> If you change a line of code without doing git-blame on it first you're doing it wrong.

Working on a project where this is necessary sounds like a hellish experience.

The place for comments explaining why the code is needed is right next to the code! On an adjacent line!


> a hellish experience.

It's literally 1 click away. Or even just a hover over the margin.

> The place for comments explaining why the code is needed is right next to the code! On an adjacent line!

And then you refactor the code (from another place) and the function name and parameters change but the comments in adjacent lines remain the same. Oops, the comments lie. After enough time passes it's 50-50 whether a particular comment is still true.

The more high-level the comment - the higher likelihood it lies, because high-level comments are by their nature further away from the code they mention (and they always mention code in many places but you won't copy-paste them everywhere relevant, right? DRY and all). So in reality the comments aren't "on the adjacent line" but "on the adjacent line in one of 10 places this code is related to - go look for all of them".

On the other hand, the commit message is always attached to all the lines that changed. If you git blame a line, you see when it was last changed and why, and you can be sure nobody messed with it without changing the commit message.


I'm not saying using git blame is hellish, I'm saying always needing to do it because of all of the things you described sounds hellish.

The original comment was that you should always do it, for every line of code.


Well you should. It costs almost nothing and it can have very high positive impact.


Hard concur. When something might look wrong or misleading to a future reader is exactly the time to comment.


Why the code is needed is different from why the change is needed.


Seems to me that if you're introducing something that seems like a bug on purpose, you should probably have the comment in the code explaining why it's there.


I often use the git blame feature in the IDE to understand what’s been going on. A good commit message will be appreciated, should I happen to find one.


I find them valuable, especially when trying to study a new codebase. In the current era where we get immediate feedback on everything we post online, it's harder to see the value that comes from writing good commits, and the value can be delayed by weeks, months, or even years.


IMO, the primary target audience for good commit messages is the same target audience as good code comments: me six months from now. Being able to read why and how a particular thing was done has helped me in debugging and troubleshooting an issue on more than one occasion.


I both agree and disagree. I think you're right that most commit messages won't end up being seen. But when you do need to see one, having a good commit message can be critical to understanding a change, especially if the person who made the change is long gone by the time you need to look at it. Or, hell, if that person was you, but it was far enough in the past that you don't recall the details.


If anything, this just tells us that tooling should incorporate commit messages a lot more. While these kinds of messages are most valuable in large projects, there are some of them in a lot of projects and they could have saved a lot of time.

Especially now with AI IDE integrations, incorporating a software's whole history into supplemental tools would be more useful than ever before.


I agree with you that searching across commit messages happens rather rarely, so the return on great commit messages might be questionable.

Where great commit messages like the one in the blog post make perfect sense is in pull requests. If the commit message explains the whole thought process that the author had while working on it, it saves so much time on pull request review.


I agree. My view is that you shouldn't write comments because if you have to, then your code isn't clear or organized well enough. If you do need a comment, perhaps to document a "Chesterton's Fence", you should put a big nice comment block to explain why and what's going on.

The reality is people don't like to read; if they do, it'll be an overview of how the code is organized. They don't want to read git commits or even comments. The code is the only truth. GPT can already explain in English what the code is doing pretty well; imagine in 2-3 years.


I think the "code should be self-documenting" view is a bit simplistic.

Good comments shouldn't explain what the code is doing, I agree that should be evident from the code itself if it's clear enough. However, why the code is doing what it's doing, or why it's being done in a certain way and not in a different way, is meta-information that is very hard to express in the code itself, and that's where comments are most useful.


I agree this one goes into more detail than is useful for future reference, most of the explanation would be better off in a PR description. But in general I would rather people go into too much detail than the more common variant of not providing any contextual information anywhere (or only in a chatroom at best) and sticking to one-line commits. As long as the important information is near the top so I don't have to wade through the verbose "this is how I discovered this issue" thing, go crazy.


Worse, PR tools like Azure DevOps (and GitHub?) don't do a good job of displaying the information.

Just a big diff.

I often get asked about the reason for a change in a review comment, even when there's a thorough description in the backing commit.

It's sad. I would prefer PRs over email, like the Git project does.


That first line of the commit message is most important so that `git log` can address Chesterton's fence. And IMHO in this case the committer whiffed.

The key is not to put what you did in that first line, but why. Anyone interested in what can just look at the code, perhaps via a diff.

So something like "nginx .conf files must be in us-ascii"

Then "changed blahblah.erb to remove nonbreaking space character"

Then the rest of the commit message which is quite good.

Think of it as a news article: write in decreasing levels of importance and increasing levels of detail, assuming the reader could stop reading at any point.
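The layout part of that (not the "why", which no tool can judge) is easy to check mechanically. Here is a minimal sketch in Python, assuming the usual convention of a subject line, a blank line, then the body. The 50-character limit and the function name `check_message` are illustrative choices, not Git rules:

```python
def check_message(message: str, max_subject: int = 50) -> list[str]:
    """Return a list of layout problems in a commit message (empty if none)."""
    lines = message.rstrip("\n").split("\n")
    subject = lines[0].strip()
    problems = []
    if not subject:
        problems.append("subject line is empty")
    elif len(subject) > max_subject:
        problems.append(f"subject is {len(subject)} chars, limit is {max_subject}")
    # Tools that show only the first line rely on the blank separator.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    if len(lines) < 3:
        problems.append("no body explaining the change")
    return problems
```

Something like this could run from a `commit-msg` hook, which Git calls with the path of the file holding the proposed message.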


Nah, first line needs to be a summary of what you changed, so that you can find the offending commit in the first place.

A news article doesn't explain WHY in the headline, it explains what.

In this case, the OP's first line is spot on... if you're reading through git log, you can see that this commit likely didn't change anything functional about a test, and you should move on.


Hard disagree.

There's little reason to search the text of commit messages to find out what changed. There are many git tools to find out which commits affected parts of the code you're interested in. Whereas, trying to find that in commit messages is really inefficient and relies on reading, rather than such automated tools.

The purpose of the commit message is to help our fellow humans get a higher level understanding than is available from quickly scanning the code.


Totally agree with you.

We already know what changed: it's the diff! We need to know why you're making the what.

The message should address why the diff is necessary. Was it a bug fix? Ok, what's the bug you're fixing? What's the evidence that you think the diff addresses it? Is it a new feature? What's the requirement? I can already tell what you did by the diff, but I can't tell from the diff alone if it actually matches the requirement!


This is true of the overall commit message, but we're talking about the first line. If I'm browsing git-log short or whatever, I don't have the diff, I have an ordered list of commit hashes and one line summaries, and I'm trying to decide which diffs I actually want to look more closely at.


I've never thought about it this way.

I've always used it as a summary, so I can understand what has been done when browsing the history. It allows me to find changes that I want to cherry-pick or revert. OTOH I can see the benefit of describing why a change was done as the summary, as then git blame on the line is a lot more effective.


> Think of it as a news article: write in decreasing levels of importance and increasing levels of detail, assuming the reader could stop reading at any point.

Great quote and life advice, will definitely steal this! Thanks!


But a commit message in an arbitrary project is not where you give someone a lesson about nginx rules.

"nginx .conf files must be in us-ascii" is maybe a good bug or pull request title, but it may correspond to multiple commits that do different things, but it doesn't tell me what's happening. Is this converting a file to us-ascii, is this writing a tool to convert files to ASCII, is this updating documentation, is it creating a test, some combination? Leading with what, not why addresses that confusion.


> The key is not to put what you did in that first line, but why.

Can't we just say that the key is to put something that makes sense for the first line, given that sometimes only the first line is printed?

I don't really care if it says "The files must be in us-ascii" or "Changed the files to us-ascii"... both of them clearly tell me that the files were changed to us-ascii.


The difference is Chesterton's fence: when you encounter something seemingly pointless you should learn why it was there before you consider removing or changing it.


Ok, but when I read the first line of the commit, I am usually not considering removing or changing it. When I consider removing or changing the content of a commit, the least I can do is read the full commit description.


I think the disadvantage with this style of documentation is you can't really alter the commit message after it's written.

(I mean you could obviously with "rebase" but are you really going to alter something written one year ago, already merged to "main", and cause a bunch of pain with everyone's feature branch etc.?)

Compare that with documentation stored in a .md file, or even a Wiki or even Confluence. My colleague can write something and if I see a way to improve it I can go ahead and do that, and other colleagues can improve on what I've written.

In this particular case I suppose the bug is fixed and won't come up again. But I also find it tempting to describe the design of a particular component when I commit that component, and that's something I now avoid. What about when that component needs to be changed by a future commit, e.g. due to the business requirements changing? Will the commit documentation just describe the differences? Then in order for a new team member to find out how the system works by reading the documentation, they've got to read multiple commit messages and "merge" them in their head.


> I think the disadvantage with this style of documentation is you can't really alter the commit message after it's written.

That is not a disadvantage. The commit is a historical record, if I come back to that commit 3 years later I want to know its purpose in the context it was in, I don’t want a whitewashed history.

> Compare that with documentation stored in a .md file, or even a Wiki or even Confluence. My colleague can write something and if I see a way to improve it I can go ahead and do that, and other colleagues can improve on what I've written.

That’s like comparing a bicycle and a goose.

> But I also myself find it tempting to describing the design of a particular component when I commit that component, and that's something I now avoid.

That’s a shame. Knowing the considerations (or lack thereof) and tradeoffs at time of creation are often useful to understand defects, either in the original, or in evolutions, or in changes of use case.

> Will the commit documentation just describe the differences?

Yeees?

> Then in order for a new team member to find out how the system works by reading the documentation they've got to read multiple commit messages and "merge" them in their head.

No, for that you maintain a separate “current” documentation, which does not need to cover implementation tradeoffs, or that the original was written under time crunch, or whatever.


> That is not a disadvantage. The commit is a historical record

OP's point is that, while a commit message is indeed a historical record, documentation isn't (or shouldn't be).

If you make the commit message double as documentation, it causes issues such as wrong information confusing or misleading future readers, because it's non-editable.


Crucially, however, the commit message is not documentation of the code, which would need to be changed and updated. Instead, it is documentation of the change, describing the reason for the change, what the code does to achieve that, and, if relevant, why you chose that solution. It provides necessary context to the already immutable diff and therefore need not be mutable itself.


I really love documentation that lives in the same repo with the code. My favorite is a .md file for every module, class or component. Some mixture of inline code docs and standalone docs is probably ideal. But docs as markdown that don't require some compile step to build, and don't require opening a browser to view, are just so much better, IMO, compared to any sort of external docs like a wiki or html on a server somewhere that gets re-generated by a CI job.


If you put docs in a markdown file, you will still be able to see what the markdown said at that time because it will also be in the commit history.


Sucks when you mess up in your commit message though and don't type the right thing.


I think the non-editable nature of commit messages is precisely the benefit though. Yes, you can't really modify them post-hoc, but being able to step through a code base's history can be really illuminating.


A commit message isn't documentation that should be updated as things evolve. It's a historical record of a single change. Sure, if you later realize you forgot to put an important detail there, that's a shame. But overall I think it's actually important that they can never change.


I really think git made a mistake in conflating the immutable log of what was changed with the (ideally mutable) story of what got merged in. So you see people arguing over squashing commits vs rebasing vs merging. Squashing commits makes the history of commits a better story of features being added. Merging preserves the immutable log of the actual changes made to the code, and rebasing sort of does a bit of both.

But, I don't see any reason we can't have our cake and eat it too. We're programming computers after all and we can make them do whatever we like.

If I wrote my own git, I think I'd split commits into those two parts. I'd leave the history of changes immutable - probably with some sort of Merkle DAG like Git does. And then have a separate associated data store which stores the commit messages, in a nice sensible, editable log describing the work that actually happened. Let people arrange and rearrange the commit descriptors however they like. If you want, group commits around feature tags, fix typos and make any changes to the messages that you want. But, the whole while the underlying log of diffs ("what actually changed in the code") can remain (gloriously) unaffected.
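Very roughly, and purely as a toy sketch in Python rather than how Git or any real tool stores things, the split could look like this. All names here (`Change`, `Repo`, `record`, `describe`) are invented for illustration:

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """Immutable record of what changed; its ID covers the diff and its parents."""
    parents: tuple[str, ...]
    diff: str

    @property
    def change_id(self) -> str:
        payload = "\n".join(self.parents) + "\0" + self.diff
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class Repo:
    changes: dict[str, Change] = field(default_factory=dict)    # append-only log
    descriptions: dict[str, str] = field(default_factory=dict)  # freely editable

    def record(self, diff: str, parents: tuple[str, ...] = (),
               description: str = "") -> str:
        """Store a change and its initial description; return the change ID."""
        change = Change(parents=parents, diff=diff)
        self.changes[change.change_id] = change
        self.descriptions[change.change_id] = description
        return change.change_id

    def describe(self, change_id: str, description: str) -> None:
        """Rewrite the story without touching the hashed change itself."""
        if change_id not in self.changes:
            raise KeyError(change_id)
        self.descriptions[change_id] = description
```

Because the description is not part of the hashed payload, editing it leaves every change ID, and so every child's parent pointer, intact.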


> I really think git made a mistake in conflating the immutable log of what was changed with the (ideally mutable) story of what got merged in. So you see people arguing over squashing commits vs rebasing vs merging.

Every team I've been on struggled with this over and over and over. The tools are so hard to use it's tempting to make the version control process facilitate "git log" instead of the other way around, which is just absolutely insane. Obviously my co-workers should learn to use their damn tools like professionals, something something a poor craftsman, but honestly? This time the tools really are to blame.


Fossil has something a bit like that.


Can you elaborate how?


You can add text - essentially a wiki page - to a commit and edit it any time.

Additionally, you can also add tech notes, which are wiki-like entries in the timeline alongside commits.


> you can't really alter the commit message after it's written

You can append with git notes, though on a message that long I expect they're unlikely to be noticed.


Commit messages aren't a replacement for source documentation. The latter contains information relevant to the tree. Commit messages are transient information (historical info as someone put it). For example, an update caused by outdated dependency. Or the tests done to diagnose a bug.


I have seldom run into this being a problem.

The context of a commit message is that someone took some minutes to explain what the context of the change is. Using their current understanding. Explain the problem. Lay out the assumptions. Given three paragraphs or so it will help immensely to figure out how or why something you/them thought was the case was in fact wrong when the message was written.

That is documentation in itself.

And if you make straightforward mistakes like a typo in an issue key in the message and you really care: you can make a note of it on the commit with git notes.

> Compare that with documentation stored in a .md file, or even a Wiki or even Confluence.

I don’t want to access a remote wiki for every little code context (certainly not Conf.). The code is just right there. Comments/Doc comments/commit messages are mostly enough for that.


I know this isn't a great solution, but GitHub does let you write comments on individual commits. You could add whatever addendums you want there.


I think commit messages are mostly valuable for a future code reader asking "why is this bit like this?" and then looking at blame logs for the answer. As you point out, bigger picture stuff ought to be elsewhere (documentation, tracking bug).

Keeping docs in version control and including doc changes with the code changes is a nice way to address your concern.


There's no reason this documentation can't be replicated in another context, and for all we know it was.


One thing I disagree with is:

> I wouldn’t expect all commits (especially ones of this size) to have this level of detail.

(emphasis added) - actually in my experience it's often the little ones, innocuous looking things that might really need a relatively longer explanation.

Yesterday I wrote three paragraphs on why I added `--limit=999` to a `gh pr list` because it's confusing: there's already a `limit(` in the `--jq` argument, and the higher it is (given say infinite PRs in total) the lower the end result will actually be. (Yes I wrote a comment too. And probably spent even longer thinking about and working it up than writing about it; hopefully I'll recall it as an example the next time someone implies the job is about churning out code!)


I agree with you that the little innocuous things often need a longer explanation, but the linked commit message is way too long IMO. It either wastes the readers' time, or it causes the readers' eyes to glaze over at the wall of text. You don't need to document your entire journey in order to document your findings and explain why.

> This was a non-ascii whitespace character that caused `ArgumentError: invalid byte sequence in US-ASCII` when running `bundle exec rake`

^ should be sufficient. It includes enough keywords to come up in a search if someone has a similar problem in the future, it contains the root cause of the problem, and it is short enough that people are unlikely to gloss over it.


Yes, it's not my preferred style either, but it's much better than 'fixes error' type thing, subject line only, that's so common.

I like the form:

    Fix ArgumentError 'invalid byte sequence'

    Non-ASCII whitespace characters cause [...]. This was apparent in [...] because [...].

    This commit fixes the issue by removing the offending character; so the file is now solely ASCII characters.

Or that sort of thing. Subject tells me why, body tells me what the problem was and how it was fixed. (Who, when, where are already in the commit metadata! The diff shows a very literal 'what' too, the what/how in the body should offer context and explanation as required.)


The article explains why all the rest is, maybe not needed, but good to have.


For a commit that adds a language binding (and might be 100+ additions/deletions) I might just say “Add X function”. Because I’m just following established patterns. But for the linked kind of change? Yeah, several paragraphs of explanation is definitely useful.


I find myself commenting code in a similar pattern: A small kernel of "interesting" code that has a 1:1 ratio (or higher) of comments to code, which enables the rest of the codebase to be "boring" self-documenting boilerplate-y code that doesn’t really warrant much in the way of commenting.


It's not a great git commit.

1) For all that text, the first line "Convert template to US-ASCII to fix error" - could be better. Maybe a couple of extra words to state what whitespace character caused the error, and what the error was. That comment plus the diff is all the context you need.

2) Honestly, everything else is kind of pointless. It doesn't hurt, but there's not a lot of value here. The author documented their journey in tracking this bug .. who cares?


People who like to learn and improve as programmers do care. In fact, the article explains the value of all that additional stuff which implies there are those who DO care.

The article even provides a link to a search result showing multiple commits from people who learned from the fix.

That commit message is a treasure trove of knowledge.


Outside of the small caveat that his first line could be better (which is what all future engineers will read while scanning commit messages), like I said, at worst, it doesn't hurt.

I like this level of detail, whether it is at the commit, PR, or ticket level. If one of my guys did this same write-up for this same problem, especially one of my junior guys, I would have patted them on the back and told them they did a great job - because you wouldn't want to discourage them from doing more of this kind of write-up in the future.

But here, we can be a little bit more honest, and the truth is, that the problem he solved was trivial, so this kind of detail is overkill for that problem. Once you find that the config file has an unprintable non-ASCII character, you immediately know most parsers would blow chunks on that - and there is only one fix - remove the problem character. So succinctly tell me the error you saw, tell me the character, and if you know tell me HOW it got in there (which is probably the most important detail that isn't in the write-up so this could be prevented in the future) - and that's enough because if in the future another engineer does a ticket/commit search for this error in our bug tracker, hopefully these details will show up immediately.


For great commit messages, just browse the git history of the Linux kernel where this is the standard.

The first line always mentions the subsystem affected by the change, followed by a one-line imperative-mood summary of the change. Subsequently, three questions are answered in as much detail as possible:

1. What is the current behaviour?

2. What led to this change?

3. What is the new behaviour after applying this change?

Example:

"Currently, code does X. When running test case T, unexpected behaviour U was observed. This is because of reason R. Fix this by doing F."
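
As a loose illustration, the shape described above can be checked mechanically. The regex, the 72-character limit, the `net:` prefix in the sample, and the helper name `check_message` are all assumptions of mine for this sketch, not the kernel's actual rules.

```python
import re

# Hypothetical convention: "subsystem: summary" on the first line,
# then a blank line, then a non-empty body.
SUBJECT_RE = re.compile(r"^[\w/.-]+: \S.*$")
MAX_SUBJECT = 72  # assumed limit, chosen only for illustration


def check_message(message: str) -> list:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    if not SUBJECT_RE.match(subject):
        problems.append("subject should look like 'subsystem: summary'")
    if len(subject) > MAX_SUBJECT:
        problems.append("subject is too long")
    if len(lines) < 3 or lines[1].strip() != "":
        problems.append("expected a blank line followed by a body")
    elif not any(line.strip() for line in lines[2:]):
        problems.append("body is empty")
    return problems


good = (
    "net: fix X when Y\n"
    "\n"
    "Currently, code does X. When running test case T, unexpected\n"
    "behaviour U was observed. This is because of reason R. Fix this\n"
    "by doing F.\n"
)
print(check_message(good))           # []
print(check_message("fixes error"))  # two problems: subject shape, missing body
```

A check like this only catches the form, of course; whether the body actually answers the three questions still needs a human reviewer.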


I was told by a recent contributor that my approach (i.e. requirement) to git messages is "unique". Apparently my Linux kernel background is showing, but all my commit messages look like the one shown here!

If it's about existing code, the comment belongs with the code. If it's a process thing (e.g. code that is removed or didn't work), it belongs in the commit.

Most importantly, while commit messages can reference issues for convenience, they MUST reproduce the critical details: GitHub is transient, git messages are not!


Here is a context-full commit message.[1]

This is so common that the maintainer wrote this[2]

[1] https://github.com/git/git/commit/d70f554cdf38b0b05cfaa8e8eb...

[2] https://lore.kernel.org/git/xmqqedevo8ps.fsf@gitster.g/


I used to write really long, essay style commit messages like this one.

Then a friend pointed out that I was effectively writing documentation and hiding it in commit messages.

Instead, I switched a lot of that effort to updating actual documentation (in a docs/ folder) that was relevant to the commit - so the commit would still have the information in it, it's just it was in an actual file and not just the commit message.

I also make sure my commits almost always link to an issue thread, as that's a great place to put all kinds of extra context around the commit that can be updated independently of the commit itself.


Git commit message aside, the described debug session raises a lot of questions about the crappy tooling developers rely on.

"ArgumentError: Invalid byte sequence in US-ASCII" is a terrible, hard-to-action error message. What file? What line? What byte sequence? This "let's give the user another problem to solve" style of error messages is pervasive in our tools.

Also, why does the tool even require US-ASCII as input in the first place? Are we still living in 1995?

Also, if only ASCII characters are allowed, why does the code editing tooling allow non-breaking spaces in source code? Is there a good reason for having such a character in this file? This problem could have been avoided if the editor could have been smarter or highlighted the "bad" character better.

This developer lost an hour of his life because of a cascading chain of defective tools.
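
For comparison, a more actionable report is not hard to produce. The sketch below is plain Python and has nothing to do with the actual Ruby tooling; `find_non_ascii` is a hypothetical helper, the text is assumed to be already decoded (e.g. from UTF-8), and the sample uses a no-break space purely as an assumed culprit.

```python
import unicodedata


def find_non_ascii(text: str):
    """Yield (line, column, code point, name) for every non-ASCII character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                yield line_no, col_no, f"U+{ord(ch):04X}", name


sample = "key: value\nother:\u00a0value\n"
for hit in find_non_ascii(sample):
    print(hit)  # (2, 7, 'U+00A0', 'NO-BREAK SPACE')
```

That answers "what line?" and "what byte sequence?" in one go, which is roughly what the original error message left out.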


This is the reason I dislike automatic squashing branches with rebase. Squashing discourages thoughtful and meaningful commit messages. What is the point of making a meaningful commit message for some specific change when it is just going to all be smashed together as a single commit on merge. I feel like rebasing is something that should be intentional to clean things up by the dev, but not as a default pattern on merge.


Squashing is very useful....for local development.

The idea of squashing already-pushed commits frightens me, glad I've never had to deal with it. Where are they doing this?


GitHub and GitLab allow merge/pull requests to automatically squash on merge. Some teams set the squash merge strategy. My team used to squash by default, but I helped convince them otherwise.


I would just go with "Remove non-breaking space characters" instead of writing a Russian novel.

Also, if you're on macOS just use a Karabiner rule [0] that converts all non-breaking space characters to regular space characters to prevent yourself from accidentally typing it out.

[0] https://ke-complex-modifications.pqrs.org/#nonbreaking_space


There are also pre-commit hooks to remove the character, even one listed on the pre-commit site. This type of hook has saved me on many occasions, and would have saved my coworkers if they used pre-commit.

https://pre-commit.com/hooks.html
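
If you'd rather roll your own, a hook along these lines is only a few lines of Python. This is a sketch, not the hook listed on that site: `fix_file` and `main` are hypothetical names, files are assumed to be UTF-8, and it relies on the usual pre-commit behaviour of passing the staged filenames as arguments and treating a non-zero exit as failure.

```python
import sys
from pathlib import Path

# UTF-8 encoding of U+00A0 NO-BREAK SPACE; working on bytes leaves line
# endings and everything else in the file untouched.
NBSP_UTF8 = b"\xc2\xa0"


def fix_file(path) -> bool:
    """Replace no-break spaces with plain spaces; return True if the file changed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = original.replace(NBSP_UTF8, b" ")
    if cleaned != original:
        p.write_bytes(cleaned)
        return True
    return False


def main(argv) -> int:
    changed = [name for name in argv if fix_file(name)]
    for name in changed:
        print(f"replaced no-break spaces in {name}")
    # Non-zero tells the hook runner that files were modified.
    return 1 if changed else 0
```

When run as a script you would finish with `sys.exit(main(sys.argv[1:]))`, so the commit stops and you can review and re-stage the cleaned files.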


Great commit indeed. Lots of context information. That's gold.

The worst I've seen are dozens of tiny commits pushed to the master branch directly. If you want to find out what it took to implement a feature, good luck.

I'm a fan of tiny commits during code review but afterwards I prefer to squash everything in a functionally relevant commit. It makes git archeology much easier.


I put less importance on commit messages being thorough, though I do admire when people write detailed information in the body. What's more important to me is to have good commit hygiene. It's something the industry is also generally terrible at, but has slightly more immediate value. For example, if your PRs have clean, atomic commits that can stand on their own, I can "rescue" chunks of useful functionality from review hell by cherry picking them out. I do this several times a month to help my teammates burn down huge PRs or take good ideas out of doomed branches.


Agreed. I'll settle for good commit hygiene over good commit messages.

It often seems a tough ask to have both


I had a terrible time when someone used "smart quotes" (beautified Office quotation marks) in a configuration file. I believe this was only possible because they copied it from Outlook.


>smart quotes

I never understood why a "stylistic" choice requires separate characters. If we don't need a serif and non-serif version of every character and instead leave it to the software, why can't we do the same with the "smart" quotes?


Typographic quotes are left- and right-handed, vs. the ASCII double quote which is just a single character:

   “quoted”

   "quoted"

Who in the blazing highs of techno-utopianism fervor thought it was a good idea to automatically translate the latter to the former we'll never know.


Why not stylize it as typographic but store it as a "regular" double quote though?


One problem with that is that the correct start/end quotes to use depend on language, which is not always known for all text snippets.


How do you handle non-paired ones then?


Render them as standard double quotes? The same way Markdown renders a single backtick as just a backtick, but text surrounded by backticks becomes code.


Nice try, but the problem with this is that typography is _really_ complex. For example, there is a rule in English typography (I'm not sure if it's often used today though) that when you have a quotation spanning several paragraphs, you should put an _opening_ quote at the beginning of each paragraph – but only one _closing_ one at the end of the quote.


We've let go of other historical typographic and spelling conventions, it's time to let go of that one too.


Computers exist to represent, transmit, and display _human_ communications. If a computer system cannot represent a human utterance we should extend the representational capacity of the computer system, not force the human to conform to the limitations of the technology.


Computers also generally don't support scribbles in the margins of your document. Or stylized glyph variants except a few that have been grandfathered in, e.g. ꙮ. Neither does the computer support whatever ligatures you might make up on the spot.

One thing that makes computers more powerful than analog tools is that they are much much much more structured.

So while we should not take computer limits for granted I think that the opposite extreme is just as if not even more absurd.


I'm also all for eliminating overly complicated typographical conventions from places where they are technologically possible.


It would be quite sad. Besides, it is not an arbitrary convention, it serves a purpose.


No it doesn't. Unbalanced quotation marks are a crime against humanity.


I would base direction on adjacent whitespace/punctuation instead of trying to do pairing.


Nice try, too. How would you handle Spanish quotation marks at the beginning of the sentence, then? And what about French quotes which are separated from their contents by a thin space? All of that is possible, of course, with large tables of special cases.

Or, you know, you could just have separate characters for opening/closing quotation marks. ;-)


> All of that is possible, of course, with large tables of special cases.

Yes, that's how ligatures work. Pairing across characters is the one I'm not sure I've heard of before in fonts.

Edit: Besides, how do you think these are typed on keyboards without separate keys? The software already detects and replaces them based on context.


I'm French and I actually really like the auto-translation. This way software that does not care about which quote those are (mail, web, etc) can swap them, and where it matters it does not translate them (vim, etc).

Sadly the new official french azerty keyboard has dedicated keys for both opening and closing quotes, and the good ol' simple quote tucked away behind modifier keys. As a dev I hate it. (arguably I should not even use azerty for development but that's another issue)


> I never understood why a "stylistic" choice requires separate characters.

I don’t think it’s a purely stylistic choice, there is actually semantics to it:

    “ opening quotation mark, i.e.: starts the quote

    ” closing quotation mark, i.e.: ends the quote

You could otherwise make the same point about parentheses: why not just |do this| instead (of this)?


Double quotes can be unambiguously autodetected by checking word boundaries, but separate single quotes are needed since apostrophe is the same character as a close quote, but can appear at the start of a word. Note that most smart-quote-generating software does fail badly at this.

Some common ambiguous pairs:

’bout - abbreviation for "about"

‘bout - starting a quote about a round of a fight

’cause - abbreviation for "because"

‘cause - starting a quote about a reason or ideal

’em - abbreviation for "them"

‘em - starting a quote about typographical units

’n’ - abbreviation for "and"

‘n’ - speaking of the letter itself
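
A naive word-boundary converter shows both halves of this. The function below is only a sketch of the heuristic (opening after the start of the text, whitespace or an opening bracket; closing otherwise), not what any particular editor does, and `smarten` is a made-up name.

```python
def smarten(text: str) -> str:
    """Naive smart quotes: opening at a word start, closing everywhere else."""
    out = []
    for i, ch in enumerate(text):
        opening = i == 0 or text[i - 1].isspace() or text[i - 1] in "([{"
        if ch == '"':
            out.append("\u201c" if opening else "\u201d")
        elif ch == "'":
            out.append("\u2018" if opening else "\u2019")
        else:
            out.append(ch)
    return "".join(out)


print(smarten('She said "hello" twice'))  # She said “hello” twice
print(smarten("rock 'n' roll"))           # rock ‘n’ roll  (wanted ’n’)
print(smarten("tell 'em"))                # tell ‘em       (wanted ’em)
```

The double quotes come out right, while every leading apostrophe is turned into an opening quote, which is exactly the failure described above.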


It's not a stylistic choice. Opening quotes and closing quotes are different things, and it isn't possible to tell the difference (when not already provided) without parsing the language in which they're used. That's why in TeX you have to manually specify which kind you want, and in software like Wordpress that just guesses, the guess is usually wrong and your published text looks ridiculous.


The complaint was purely about software replacing normal quotes with "smart quotes", not that it did it wrong.


But the question was why there's more than one code point between normal quotes, left quotes, and right quotes, and the answer is that it isn't possible for left and right quotes to share a code point.


"quote unification'':

  most typefaces: “Hello” „Hallo“
  Verdana: “Hello” „Hallo“


Yeah, I've been bitten by those quotes in the past too. I noticed recently that VSCode (probably other IDEs too) highlight these characters pretty clearly to help avoid these issues.


I wonder how much infighting there was between orgs at Microsoft over this, with an Outlook/Word PM escalating... "Make the languages understand smart quotes!"


In other words real quotes that people use in published writing.


For a long time there was a Perl script called the Demoronizer that fixed this kind of nonsense.


I think this pops up in macOS shells too


IIRC TextEdit.app has options like smart quotes, auto capitalize, and spell check turned on (in addition to being rich text by default), so you have to change all those to be a dumb plain text editor.


TextEdit and the Notes app have both caused me to copy/paste the specialized quotation marks.

Now I only use vim as a scratchpad because I've been bitten too much by GUI apps


I'm not at all a fan of this commit message. The summary line is vague (what template? what error?) and then the body spends 250 words explaining all the steps it took to get to this fix. What is this, a recipe on the web?

The commit message should explain the change being made, what impact it will have, and why it is being made. The audience is the developers reviewing the change or someone looking through the logs to determine why this line changed.


Agreed, this isn't a good commit message at all. Yes, you should write detailed commit messages, but this one is just verbose without adding much useful info. We don't need to know the exact command the original developer used to figure out the problem and we definitely don't need to see his shell prompt including host name and all.



Thanks! Macroexpanded:

My favourite Git commit (2019) - https://news.ycombinator.com/item?id=22519632 - March 2020 (67 comments)

My Favourite Git Commit - https://news.ycombinator.com/item?id=21289827 - Oct 2019 (370 comments)


Essentially zero people read complex commit messages.

Do with this information what you will.

9/10 the code already is documentation enough for what the code currently does, if you need to go back through history then look at the commits. The messages are generally noise.

I've literally never cared _why_ someone made a change, I can see the change, I can see the effect of the old and new code. Rarely, if ever, has the thought process ever changed how I will interact with the code in question.

If I am at the level of debugging or history spelunking that the _commit message_ is the thing that saves me - I've already lost and there are other glaring organizational or design issues that are the actual problem.


> Essentially zero people read complex commit messages

I don't think that's true. I worked in support doing break/fix and outage response work at a large organization. That means constantly dipping into codebases I'm utterly unfamiliar with. Often there is complexity, un-obvious elements, previous incorrect attempts at a bugfix and so on, where understanding what the author intended can save literal hours of examination, experimentation etc.

> If I am at the level of debugging or history spelunking that the _commit message_ is the thing that saves me - I've already lost and there are other glaring organizational or design issues that are the actual problem.

This is kind of understanding what I mean, but there was no organizational or design failing here. This is just the nature of some work, I believe.


> [..] where understanding what the author intended can save literal hours of examination, experimentation etc.

The problem with that is that you are relying on an inherently unreliable source of information - a human to enter details which may or may not lead you to the correct path.

The code doesn't "lie". Just read it and the current issue and work from there.


The code may tell you what it does, but that doesn’t necessarily tell you why it’s there.

Especially for non-obvious pieces of code I have to deal with, I certainly prefer to understand the original reasoning and context within 5 minutes by looking at the original commit / PR, than having to spend multiple hours rediscovering that one quirky edge-case scenario that someone else already dealt with 3 years ago.


The code might not lie, but it also does not tell you why the DNS record in your IaC was changed from 1.2.3.4 to 4.3.2.1...

Well-written commit messages are useful. Some people use an issue tracking system instead, but the commit message should have the information in a more concentrated form instead of spread over 50+ messages of discussion.


Code doesn't lie, but it isn't always obvious either. A diff that fixes a subtle corner case is very difficult to understand without explanation. A function to fix dirty external data cannot be understood without reference to what the author is fixing. It may not be clear why a certain performance trade-off is preferable. A mistake may be hard to detect if you don't know what was intended.


There should be a test along with that edge case then, in the same commit that fixed the edge case. Possibly updates to documentation, change log etc.

Commits are purely for developers, excluding the rest of the team.


> The code doesn't "lie". Just read it

No thanks. Some diffs aren't obvious. I'm not a mind reader. Having extra context is useful.


Even if it's true that essentially zero people read complex commit messages, that one person who needs to read it, 2 years later, will really appreciate its existence.

To me a commit message isn't there so someone can sit around and read commit messages to tell themselves a story. It's there in the -- hopefully unlikely -- possibility that there's a problem with the change, and someone needs to come back to it later, with no context, and understand why it was made so they can figure out what needs to be done.

I've absolutely used commit messages when debugging, and I don't think there's anything wrong with that.


I try to write useful commit messages. Sometimes they're as expansive as this example, but not always. On GitHub, at least the way my team uses it, the PR is the more visible unit of code change. If you have a single-commit PR, GitHub will automatically make your commit message the PR description, which is nice. It does not do that if you have more than one commit, in which case I write a general overview of the changes and write "See individual commit messages for more detail" in bold.


I'm surprised the author (of the git commit) put that much effort into the message, but did not mention what exactly that character is (its Unicode code point).


At the DLF, our pull requests are usually accompanied by a link to the Bugzilla entry, which usually has a detailed explanation.

P.S. Having multiple Unicode values that exhibit identically when displayed is a huge veer-into-the-ditch mistake. I.e. the notion that code points should have semantic value is simply wrong.


> our pull requests are usually accompanied by a link to the Bugzilla entry, which usually has a detailed explanation.

This practice annoys me quite a bit, actually. Well, if there is a bug tracker issue, of course, definitely link to it in the commit message. But that should provide extra, optional information; everything I reasonably need to understand the change should still be in the commit message. I don't want to have to chase down the information in an external system, a system which may not even be running anymore... and of course we all know that when systems get shut down, of course they get backed up and archived and made accessible properly every time... right.


I see your point. We are migrating to using github issues for that and related reasons.


That just means you are now relying on someone else's computer staying around, with URLs that you cannot control in the future if you need to. It doesn't actually solve the problem.

It also means that the explanations are not searchable locally.


> P.S. Having multiple Unicode values that exhibit identically when displayed is a huge veer-into-the-ditch mistake. I.e. the notion that code points should have semantic value is simply wrong.

Should a cyrillic `а` and a latin `a` have the same code point?

If they did then there's no consistent way to group those two alphabets, one or the other would end up with letters outside its main grouping.

And what happens if the shapes of those two characters diverge over time? Do you have a breaking change to introduce a new code point for one of the languages, or do you just make it impossible to have a font that can be used for both latin and cyrillic at the same time?

This isn't just a theoretical problem, by the way. There are characters in Chinese and Japanese that share a code point but the shapes of the characters aren't the same in the two languages.
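
For anyone who wants to poke at the status quo, Python's `unicodedata` module shows how the two letters are kept apart today. This is only a small demonstration of current Unicode behaviour, not an argument for either design.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, drawn like the Latin letter in most fonts

for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# U+0061 LATIN SMALL LETTER A
# U+0430 CYRILLIC SMALL LETTER A

# Compatibility normalization does not merge them either,
# so string comparison still sees two different letters.
print(unicodedata.normalize("NFKC", cyrillic_a) == latin_a)  # False

# Each letter keeps its case mapping inside its own script block.
print(f"U+{ord(cyrillic_a.upper()):04X}")  # U+0410
```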


> Should a cyrillic `а` and a latin `a` have the same code point?

Yes. Consider a book. Can you tell if it's a cyrillic or a latin `a`? Of course you can, because of the context. Unicode is about visible text, having hidden semantic meaning makes it something else.

Besides, 'a' can have all kinds of semantic meanings - all depending on the context in which they are used. There is no way to encode all this into Unicode.

> If they did then there's no consistent way to group those two alphabets

Doesn't matter.

> And what happens if the shapes of those two characters diverge over time?

If it actually looks different, then it becomes a different code point.

> There are characters in Chinese and Japanese that share a code point but the shapes of the characters aren't the same in the two languages

More evidence that the Unicode committee lost its way.

Think of it this way. Printing Unicode text on a piece of paper, and then OCRing it back into Unicode, should be a lossless operation. Or another way - anyone should be able to tell what the code point value is by looking at the visual representation of it.


How about an l and an I? Or a closed "a" and one with the little handle? A zero with a stroke, a zero without a stroke, and an O?

I can see where you're coming from, but deduplicating every glyph from every culture based on which do or don't generally look the same when printed sounds like a tall order. And if all you want to record is the shape, you can use a PNG with OCR and bypass Unicode entirely.


> How about an l and an I? Or a closed "a" and one with the little handle? A zero with a stroke, a zero without a stroke, and an O?

Those are font differences, not character differences. (Unicode has also failed by adding in some fonts. The nuttiness never ends.)


But just as the “a” in alphabet and the «а» in азбука are only distinguishable by context, so too the “O” in SOS and the 0 in 90210 are often only distinguishable by context. I don’t see how you propose to determine which glyphs should be the same and which should be different in your system.


> so too the “O” in SOS and the 0 in 90210 are often only distinguishable by context

Only if it's a bad font.


> If it actually looks different, then it becomes a different code point.

Which language gets the original code point and which one needs to change?

And what happens to all of the existing text written with the original code point?


Obviously we are stuck with Unicode as-is now, warts and all. But that doesn't mean that the original design was the right one.


How does your scheme handle Turkish "Dotless I"?

Should there be different codepoints for serif and sans-serif versions of the same letter? After all, in some typefaces, "uppercase I" and "lowercase L" look just about the same.


I think you mean fonts. No, Unicode should not encode fonts.

Nor should it encode italic, boldface, underline, line out, reverse video, point size, superscript, subscript, or colored. Those are all style attributes, best applied with a style sheet, not a code point.


> superscript, subscript

Very strongly disagree. Copy-pasting 10<sup>6</sup> turns into 106 rather than 10⁶.


Depends where you copy and paste it.

Encoding superscript variants is a fool's errand because you really need an exponential number of variants for ALL characters to fully support possible combinations of styles. The only reason (a very limited selection of) superscript variants are in Unicode at all is for backwards compatibility with mistakes made by earlier encodings.
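
Both sides of this can be seen from Python's `unicodedata` module. The snippet below just inspects U+2076 SUPERSCRIPT SIX as an example; it shows that Unicode marks the character as a compatibility variant of the digit 6, so it survives as plain text but is folded away by compatibility normalization.

```python
import unicodedata

plain = "10\u2076"  # "10⁶" written with U+2076 SUPERSCRIPT SIX

print(unicodedata.name("\u2076"))                    # SUPERSCRIPT SIX
print(unicodedata.decomposition("\u2076"))           # <super> 0036
print(unicodedata.normalize("NFKC", plain))          # 106
print(unicodedata.normalize("NFC", plain) == plain)  # True
```

So the superscript is preserved under canonical normalization (NFC), and any pipeline that applies NFKC turns 10⁶ back into 106.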


It's true that giving a little potted history like this is "good" (other than he should have made a nice informative first line for summaries)

BUT it's not super useful to say this is good, the hard part is knowing WHEN to put this much effort in and when you can skip it.

I have many instances where I could do a longer story like this but it would be exhausting to do it every time. I try to do it when the commit might look unclear in intent or effect to an outsider, when the change is being made for an important reason, /and/ where there is a potentially negative consequence (like naively reverting or writing bad code) if the change is not explained.

I think this is a decent example of that but not great, because no one is intentionally going to go in and start introducing nonbreaking spaces.


> BUT it's not super useful to say this is good, the hard part is knowing WHEN to put this much effort in and when you can skip it.

I learned git in the context of Linux kernel development (its original use case). In that context, the commit message is where you explain, to whoever is reading, what your change does and why it should be accepted. The more subtle the change is, the longer and more detailed the explanation must be to convince a reviewer.

So basically, the rule of thumb would be: pretend you're going to email your commit to another developer, who has the power to accept or reject your change, and who is not going to consider anything outside that email in their decision. The more subtle your change is, the more detailed and convincing its explanation has to be.


A complementary virtue is that the commit is tightly scoped to exactly one change. I still see most engineers commit whatever they had in their working directory as a sort of blanket Save Point, without any thought to how those changes can be captured as individual commits that can be commented and reviewed on their own merits.

This will typically also involve completely unnecessary changes, because when you're merging unrelated changes anyway, the unnecessary ones are swept up in the noise. At best they complicate rebases for other contributors, but too often they also cause outright regressions.

It goes without saying that the commit messages are a write-off at this point, because even if they felt motivated to take the time to comment it clearly, the change is so messy and nebulous that it becomes hard to comment on. If their code gets reviewed it's more likely the reviewer gives up and stamps it so they no longer have to look at it.

Most people still don't seem to know that `git add -p`, `git reset HEAD`, `git stash`, `git rebase --interactive`, etc. are even available. They never learn what git is capable of, so they act like version control is a bureaucratic obligation rather than the peerless superpower that it can be. The problems they cause don't end at their terminal though, because now they've made a mess of the repository for every other contributor as well.


> git add -p

That command is a UX monstrosity and don't ever prompt any junior to use it. Use magit, use jj, use your IDE, just not that.

I agree with your overall sentiment however. I try to measure up commits in anticipation of future revert(s). A commit is a minimal change that can still (1) pass UT (2) become a sensible revert one day. That's my measuring stick.


Big fan of the straussian commit message/commentary style. You read it once and think you understand, but you come back to it much later and understand it in a second, deeper way. There’s an art to this that some people seem to have. Maybe it correlates to taste.


I have a different opinion about favourite Git commit messages.

I think commits should be small steps that display the thought process of the author. Every individual commit should be self-explanatory. So the commit message should not describe (again) what the changes are but why they're necessary. Sometimes the change is not self-explanatory and then I'd put a longer description below.

Somehow I came up with this on my own, so I'd be interested if it really makes sense or if others have a similar style.


I just don't think this is feasible, and I think you're maybe over-estimating the diff-reading skills of others. I agree that changes should be as small and self-contained as possible, but I don't think it's reasonable to expect a later diff-reader to be able to understand what the change is just by looking at it.

Certainly a commit message should include the why, but it should include -- start with, really -- the what as well.


Not completely on topic (if you read TFA) but my favorite Git commit is by compiler badass and HN frequenter, where he checks in an entire C compiler to the D language repo:

https://github.com/dlang/dmd/pull/12507

https://news.ycombinator.com/item?id=27102584


This post has a similar tune:

How to Write a Git Commit Message / The seven rules of a great Git commit message https://cbea.ms/git-commit/#seven-rules


When a developer admits that a single wrong character cost them one hour, it probably took 3 hours.


A good commit message must explain the reason for the commit (i.e. fix nasty char encoding issue). This commit is nice but far too long in my humble opinion (and I like to write!). The what is already in the commit diff. Explain the why, trust me.


It would be cool to have a tool to add this stuff to a logfile, that gets auto dumped to the next commit message.

Something like a vs code command that would add to the message for the commit I'm working on.


Just an aside: is there a vim syntax command to highlight weird unicode whitespace as an error?

Something like:

        syn match unicodeWhitespace /[list of unicode whitespace]/
        hi def link unicodeWhitespace Error


I wouldn't know about the Vim equivalent, but I wrote this in my Emacs config long ago:

    (defun find-non-ascii ()
      "Find and show all of the non-ascii characters."
      (interactive)
      (occur "[[:nonascii:]]"))

It has proved quite useful on occasion.

(Yes, I could just invoke Occur directly, but I can never quite remember the regexp character class and syntax here.)


No worries. I managed to whip something up:

        autocmd BufWinEnter * match ErrorMsg /[\xa0\u1680\u2000-\u200a\u202f\u205f\u3000]/
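
For checking files outside the editor (a pre-commit script, say), the same idea can be sketched in Python. This is only a rough sketch: the character class is copied from the Vim pattern above, and the function name `find_unicode_whitespace` is made up for illustration.

```python
import re

# Same character set as the Vim pattern above: NBSP (U+00A0), U+1680,
# U+2000..U+200A, U+202F, U+205F and U+3000.
UNICODE_WHITESPACE = re.compile("[\u00a0\u1680\u2000-\u200a\u202f\u205f\u3000]")


def find_unicode_whitespace(text):
    """Return a (line, column, codepoint) tuple for each match, 1-indexed.

    Hypothetical helper, not part of any tool mentioned in this thread.
    """
    hits = []
    # Split on "\n" only, so line numbers match what an editor shows.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in UNICODE_WHITESPACE.finditer(line):
            codepoint = f"U+{ord(match.group()):04X}"
            hits.append((line_no, match.start() + 1, codepoint))
    return hits
```

Feeding it the contents of a file gives you something to grep or fail a hook on, e.g. `find_unicode_whitespace(open(path, encoding="utf-8").read())`.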


For non-breaking spaces specifically, there is :help 'list'.


I guess it makes sense that it's a common name, but I was expecting this David Thompson[0] instead :-)

[0] https://dthompson.us/


I would prefer to channel the energy that went into writing this lengthy description into actually fixing the toolchain so that it at least fails with a more actionable error message.


I do appreciate them but they’re just a pain to write sometimes


In my opinion one of the most important features of a commit message is that it fits in a single line so it can be read in git log.


I started using gofakeit's "hackerphrase" for all commit messages.

https://github.com/andrewarrow/feedback/commits/main/

hp | git commit -a -F -

hp is a golang binary that just spits out a hacker phrase. I have this aliased with the letter q for "quick" so I'm always checking in stuff with q return push done.


that's a great idea but you can make it even more efficient

  head -c50 </dev/urandom | xxd -p -u | git commit -a -F -

That way you'll never accidentally overflow the commit message first line length and no one will even begin to think that the messages might mean something!


this is great. If I do a mix of these + hacker phrase I'll be able to find a commit like, "oh yeah, it's the one right after the urandom bytes."


I've been using opencommit[1] and have been pretty happy with the summaries it's generating. I usually add a one-liner at the top to summarize the reason for the PR, but it saves me a lot of typing documenting what was actually done at a high level.

[1] https://github.com/di-sukharev/opencommit


This is a nice example of Neves' Law: the harder the bug is to find, the smaller the diff will be.


I really hate that commit message. It is extremely verbose and doesn't allow you to easily understand what was done in the commit in a single sentence or paragraph. It mixes a very narrative explanation in there that is hard to skim. It is in desperate need of a clear TLDR version.

If you really like the descriptive, verbose message, then the most important description should be at the top and it should gradually go into less interesting details as you read down, like in a news article.


The uber-detailed exploration is worthy of a blog post; those are for stories! But a commit is way too obscure a spot to put it in, so it's a waste of effort on the writer's side and a waste of attention on the readers' side. A short message describing why changing a space resolved which bug, and where, would be more efficient.


It's a great commit message, but could be even better with "TL;DR non-breaking space in a comment can cause parser problems".


“Fix mis-encoded white space” is clearer to me.

No need for a very lengthy explanation.


My rule for any developer doc is: don't tell me what, tell me WHY


I agree commit messages are the most important form of documentation.

But I disagree about the format. I prefer commit messages like:

   JIRA-123 one-line 80-char-at-most description
   
   Long description if needed (but preferably keep it in JIRA).
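
If you want to enforce that shape rather than just prefer it, a check along these lines could run from a commit-msg hook. A minimal sketch only: the ticket-key regex and the `check_commit_message` name are my own assumptions, not anything Git or Jira ships.

```python
import re

# Assumed ticket-key shape: uppercase project key, a dash, digits, then a
# space and the description (e.g. "JIRA-123 Fix encoding").
SUBJECT = re.compile(r"^[A-Z][A-Z0-9]+-\d+ \S.*$")


def check_commit_message(message, max_subject=80):
    """Return a list of problems; an empty list means the message passes.

    Hypothetical helper for illustration only.
    """
    lines = message.splitlines()
    problems = []
    if not lines or not SUBJECT.match(lines[0]):
        problems.append("subject must start with a ticket key such as JIRA-123")
    if lines and len(lines[0]) > max_subject:
        problems.append(f"subject is longer than {max_subject} characters")
    # The long description, if present, is separated by one blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    return problems
```

A hook script would read the message file Git passes as its first argument, call this, and exit non-zero if the list is not empty.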


> but preferably keep it in JIRA

Jira is an additional indirection to a tool you will (not might) eventually lose. I’ve seen commits which had lived through 3 VCS transitions.

Not only that, but a lot of information is often considered undesirable on tickets.


Deeply strange to me that you can think that they're the "most important form of documentation" and then propose to put the documentation elsewhere.


JIRA contains more context than you can put in even the most elaborate comment. For example: links to duplicates, test cases, screenshots, core dumps, discussion with testers/customers, etc. And it's updated over time.

Would you change the commit message if a tester noticed something about your explanation of the bug was wrong but the fix still works?

Additionally the context is already there in the JIRA, retyping it in detail in a different format into the commit message is more friction for little benefit.


Every country should have a GDS. They do great work.


My favorite git commit is "Bug fixes"


I'm about to release 1500+ commits. I hope people will be able to praise some of my commit messages like this one.


great commits are great. This is fantastic

As an aside, I'm tired of documenting:

- in code

- in commits

- in jira

- in confluence

- in daily standups

- in release notes


- in Sharepoint

- in some random Google Doc owned by one person that has to give permission to each individual who needs to read it, even though it's something the entire Engineering department should have access to.

- in a GitHub Wiki that nobody even realizes exists

- in a separate git repo that exists for just documentation, but nobody has checked anything into for 2 years


sometimes I wonder: If I didn't need to update 6 things, get approval from two colleagues, and justify the time to a scrum master watching how many tasks I complete every 2 weeks... would I write more meaningful documentation?


The code and commit documentation are about different things. One is about what this thing is, the other is about what is changing.

But then, jira is about what is changing, confluence is about what is changing, standups are about what is changing (oh, but this is doing them wrong), and release notes are about what is changing. So your complaint is completely reasonable, just the first item shouldn't be there.


You also get what is changing by doing "git diff", and often, it is self explanatory, especially if you have already commented in code why you did what you did.

For example:

  - code: MAX_SIZE=1024 // maximum size the backend supports
  - commit: limit the size to 1024, as it is the maximum the backend supports
  - jira: fixed the problem by limiting the size to 1024, as it is the maximum the backend supports
  - confluence: do not exceed 1024, as it is the maximum size the backend supports
  - standup: I set the maximum size to 1024, as it is the maximum the backend supports
  - release note: Maximum size set to 1024, as it is the maximum the backend supports

This is all redundant. I used a simple example for the sake of brevity, and one-liners are reasonable in every case, but the problem is when you are expected to write a wall of text every time, when in reality, there is everything you need in the code.


I'm generally wary of code comments. They tend to rot; people change the code around them but then forget to update the comments. They also take up valuable screen real estate. I'd mostly rather be reading code than comments, and I can see more context when there are fewer comments.

If a comment truly is necessary, it should explain why the code is doing something tricky, unusual, or unexpected.

In your particular example, that comment is superfluous; the constant should just be called MAX_SUPPORTED_SIZE, or something similar, that describes it better and makes the comment redundant and unnecessary.

Otherwise I agree; there's so much redundancy in our work, it's silly...


I am generally of the same opinion as you are. I dislike superfluous comments and documentation. In fact, I usually try not to read these, as they can be deceiving.

But this is one of the few cases where I think comments are worthwhile. That is, as an out-of-band channel where you explain things that are not in the code base, that are not common knowledge to skilled programmers, and that are hard to express in other ways.

This is a simplified example; normally, one would say more than that. Something like "for historical reasons, the backend is configured with a size limit of 1024, larger messages may be dropped". This value may be the result of a conversation with someone from the backend team, and without this comment, a newcomer will have absolutely no idea about the why, even if he is a programming god. And there is no easy way to express that in code. "MAX_SUPPORTED_SIZE" without any context may be insufficient, as it may raise questions about why this limitation exists even though the software used for the backend shouldn't have this limit.

I don't think the value of comments is only on the trickiness of the code. Some code like bit twiddling hacks may be tricky, but these are the kind of things good programmers should know, or if they don't, learn. There are publicly available resources for that. But parts, even simple parts, that require domain knowledge, or worse, knowing specific people in the company definitely benefit from comments if code can't express it.


As someone who's been 'the next guy' to people who share your opinion: I really disagree with your opinion


I don't think I've worked at a company that used commit messages to record what is changing.

I'm not saying that I am disagreeing with you on that point. I’m just sharing my experience

We tend to rely on the diff and PR for the ‘what is changing’

The PR doesn't list the commit messages anyway, just the last commit message and the PR description

We add links to jira/confluence a lot in code to give direct access to the dev. Commits are tied to features or bugfixes

So effectively commit messages are not useful anymore

The only one that might be useful is the merge commit message

Again, not disagreeing, just complaining lol


- in code: code-level context

- in commits: changeset-level context

- in jira: if you need to provide changeset-level context, point them to the MR/commit log

- in confluence: high-level documentation

- in daily standups: status update; if you need to provide changeset-level context, point them to the MR/commit log

- in release notes: generate automatically from commit log

IMO duplication in documentation should be treated the same as in code – it should be avoided as much as possible.


Agreed except for

> - in release notes: generate automatically from commit log

Release notes are for a completely different audience and should be custom-tailored to that audience. They also shouldn't include changes that were reverted or made redundant by later changes in the same release.

Sure, commits-as-change-log is better than no change log but still very very far from optimal.
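
Even the purely mechanical part needs more than dumping the log. A rough sketch of dropping reverted changes, assuming reverts use the `Revert "<original subject>"` subject that `git revert` generates by default; `drop_reverted` is a made-up name, and hand-edited revert subjects would slip through:

```python
def drop_reverted(subjects):
    """Filter commit subjects (oldest first), removing reverted pairs.

    Hypothetical helper: relies on the default `Revert "<subject>"`
    wording, so renamed revert commits are not detected.
    """
    prefix = 'Revert "'
    kept = []
    for subject in subjects:
        if subject.startswith(prefix) and subject.endswith('"'):
            original = subject[len(prefix):-1]
            # Cancel out the most recent commit with that exact subject.
            for i in range(len(kept) - 1, -1, -1):
                if kept[i] == original:
                    del kept[i]
                    break
            else:
                # It reverts something from before this release: keep it.
                kept.append(subject)
        else:
            kept.append(subject)
    return kept
```

What is left still has to be rewritten for the people who actually read release notes.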


love those guys at the UK digital service!


Is it a trend now to write the blog post as the commit message?

I must be getting old because I hate this.



