Oh gosh, this 100 times. I comment everything, even rationale behind decisions, ...

throwawaylinux · on Sept 20, 2022

Changelogs are very good for that if you write good changesets. They don't go out of date either, because you know exactly what source they apply to.

baby · on Sept 20, 2022

I feel like nobody goes to look at changelogs, or PRs, or commits. Probably because they don't ever expect anything good from it. Also they're not really searchable.

jlokier · on Sept 20, 2022

It depends on the project. Linux kernel commits often have excellent messages, because that's the standard expected.

baby · on Sept 20, 2022

But still, how would you search through commit history to figure out one thing? Comments are right here in the code, and books/external doc/rfcs refer to concepts.

I feel like commits are only good if you're spelunking, which is usually for a single reason: you're bisecting looking for a bug.

throwawaylinux · on Sept 20, 2022

> But still, how would you search through commit history to figure out one thing?

git log and git blame, for example. It's right there with the code too.
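
A rough sketch of what that can look like, wrapped in Python so it is easy to script. The helper names, file path and search string are made up for illustration; only the git options themselves (`git blame -L` and the `git log -S` "pickaxe") are real.

```python
import subprocess

def blame_cmd(path, start, end):
    """Build a `git blame` command for lines start..end of `path`."""
    return ["git", "blame", "-L", f"{start},{end}", "--", path]

def pickaxe_cmd(needle, path=None):
    """Build a `git log -S` command: commits that changed how many
    times `needle` occurs, optionally limited to one path."""
    cmd = ["git", "log", "--oneline", f"-S{needle}"]
    if path is not None:
        cmd += ["--", path]
    return cmd

def run(cmd, repo="."):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(cmd, cwd=repo, check=True,
                            capture_output=True, text=True)
    return result.stdout

# Hypothetical usage inside a checkout:
#   print(run(blame_cmd("src/parser.c", 120, 140)))
#   print(run(pickaxe_cmd("retry_limit", "src/parser.c")))
```

The blame output gives a commit per line, and the pickaxe output gives the commits where the string appeared or disappeared, which is usually where the explanation lives.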

"Why is the code written this way? ... oh yes it was changed in commit #blah because of this problem."

baby · on Sept 21, 2022

Never really worked for me. I just find the last refactor commit that moved or modified the line of code.

throwawaylinux · on Sept 21, 2022

Then look at the prior commit.
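
One way to do that, as a sketch: re-run blame starting from the parent of the refactor commit, and ask git to ignore whitespace changes (`-w`) and to follow moved or copied lines (`-M`, `-C`). The function name and the example commit and path are hypothetical; the options are standard `git blame` ones.

```python
def blame_before(commit, path, start=None, end=None):
    """Build a `git blame` command that starts at the parent of `commit`,
    so the refactor itself no longer hides the earlier history."""
    cmd = ["git", "blame", "-w", "-M", "-C"]
    if start is not None and end is not None:
        # Line numbers refer to the file as it was in the parent commit,
        # which may differ from the current checkout.
        cmd += ["-L", f"{start},{end}"]
    cmd += [f"{commit}^", "--", path]
    return cmd

# Hypothetical usage: skip past refactor commit abc1234
#   blame_before("abc1234", "src/parser.c")
```

If the file was renamed by the refactor, the path has to be the old name as of the parent commit.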

baby · on Sept 21, 2022

anecdotal: https://twitter.com/cryptodavidw/status/1572621875091742720

jlokier · on Sept 22, 2022

> Comments are right here in the code,

If you write down in comments the history of why the code is written the way it is now, not just the current code but all the things that were tried before and why they had to be changed, you'll have too many comments and it will be hard to read the code. That's why it's rarely done.

> I feel like commits are only good if you're spelunking, which is usually for a single reason: you're bisecting looking for a bug.

I've read history to see why things are the way they are, but I agree that most of the time it's looking to see when a bug was introduced and what was known about it at the time. That's a pretty important use, though. If all you get is a commit with no useful message, you can see which lines of code were changed but not the reasoning and investigative data behind them. Many bugs show up from changes that were themselves supposed to fix bugs or make something subtle work in a particular way, so the reasoning behind them is relevant when a new bug is discovered.

With a good issue tracker, a commit message like "fixed #386" is theoretically enough because the information is in issue #386. But tbh it's still friction to see lists of commits which contain nothing more than # references to pages somewhere on GitHub and no useful description. I prefer to summarise the issue and the fix in the commit message (and PR message) in those cases.

(To an extent it depends on whether you're using Git itself, or GitHub/equivalent, as the latter expands # references to include the one-liner description when displaying the messages. I find GitHub extremely slow compared with Git, and it has awful commit history tools (won't show the graph for example), so I use Git and see # references by themselves. When colleagues produce a lot of these, it's like a sea of unexplained changes, as if nobody can be bothered to say what their code does at all.)

Another completely different reason I've grepped through git history with the Linux kernel and other widely used projects like Glibc and GCC is to see every change to an API or subsystem or function throughout its history, in order to write "portable" code that will work with every version across a large time range. Occasionally I've even written a short document listing every change that's relevant to what I'm building, to help me build the thing.
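
For anyone wanting to try this kind of history grep, here is a minimal sketch of two queries that can help. The function names, the symbol, the tag range and the path are all placeholders; `git log -L :<funcname>:<file>` and `git log -G<regex>` are real options.

```python
def function_history_cmd(func, path):
    """Build `git log -L :func:path`: every commit that touched the body
    of `func` in `path`, with the relevant diff hunks."""
    return ["git", "log", f"-L:{func}:{path}"]

def api_changes_cmd(pattern, rev_range=None, paths=()):
    """Build `git log -G<pattern>`: commits whose diff adds or removes a
    line matching the regex, oldest first, one summary line each."""
    cmd = ["git", "log", "--reverse", "--date=short",
           "--format=%h %ad %s", f"-G{pattern}"]
    if rev_range:
        cmd.append(rev_range)
    if paths:
        cmd += ["--", *paths]
    return cmd

# Hypothetical usage (symbol, tags and path are placeholders):
#   function_history_cmd("some_api_function", "src/api.c")
#   api_changes_cmd("some_api_function", "v1.0..v2.0", ["src/"])
```

The one-line-per-commit output of the second query is a reasonable starting point for the kind of "every relevant change" document described above.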

This is particularly important with system calls, library functions, and internal APIs (e.g. for kernel modules). Although it's rare for an external API change to break existing code (though it does happen), it's common for an API feature which works today to be missing or buggy in the past, in versions which are still being used by someone. Internal APIs change more often, so finding the changes is even more essential. Writing portable code means finding the history of all those changes, including bugs and feature additions, to write code that works correctly when it's running on any version.
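
To make that concrete, here is a sketch of how notes gathered from the history might be turned into a runtime check. Everything in the table is invented for illustration (the feature name, the "added" version and the "buggy" range are not real data), and version numbers are only an approximation, since distributions backport fixes.

```python
import re

def parse_release(release):
    """Turn a release string such as '5.10.0-21-amd64' into (5, 10, 0)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised release string: {release!r}")
    return tuple(int(g) if g is not None else 0 for g in m.groups())

# Hypothetical notes distilled from reading commit messages.
FEATURES = {
    "example_feature": {
        "added": (5, 6, 0),                      # first version that has it
        "buggy": [((5, 8, 0), (5, 8, 3))],       # inclusive known-bad ranges
    },
}

def usable(feature, release):
    """Return True if `feature` exists and is not known-buggy on `release`."""
    version = parse_release(release)
    info = FEATURES[feature]
    if version < info["added"]:
        return False
    return not any(lo <= version <= hi for lo, hi in info["buggy"])
```

On Linux, `platform.release()` from the standard library returns a string that can be passed to `usable`. Where the API allows it, probing for the feature at runtime is more reliable than comparing versions.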

For example when I was writing code to use io_uring, a large part of the work was going through every change to the kernel io_uring subsystem to check every change affecting those parts of the API I was using, so I could avoid using them on buggy kernel versions, and so I could adapt to API changes that occurred. (This was also useful for future-proofing the code in that my test environment wasn't able to run the latest kernel, but in examining the history I'd also see "future" changes that my code would need to work with when shipped.)

The explanatory commit messages were essential for that. There's no way I could have understood the purpose of relevant changes in a useful timescale without those messages. Particularly for things which affected performance or thread correctness in subtle ways only with some machines and some applications, that you simply could not see from the code.

You might argue that comments should be there to explain all non-obvious aspects of the current code, but for code like this, which contains thousands of "Chesterton's fences" at high density, that style would be very comment-heavy, and that style is generally discouraged. In effect, there's more to the code than meets the eye. At least with the Linux kernel, the culture evolved to expect explanatory Git commits (before Git it was the mailing list; go back far enough and there were more comments in the code), so everyone knows to look at Git and lists now, keeping the code itself relatively clean as a result.