I saw someone quip (on Twitter, I think) many years ago something like:
"A junior engineer writes comments that explain what the code does. A mid-level engineer writes comments that explain why the code does what it does. A senior engineer writes comments that explain why the code isn't written in another way."
(except punchier, of course. I'm not doing the quip justice here)
> writes comments that explain why the code isn't written in another way."
Exactly! I have written code that required comments five times as long as the code itself to defend the code against well meaning refactoring by people who did not understand the code, the domain, or the care needed to handle large dynamic ranges.
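A minimal sketch of what such a defensive comment can look like. The function is a hypothetical stand-in chosen for illustration (a log-sum-exp over values with a large dynamic range), not the code described above:

```python
import math


def log_sum_exp(values):
    """Return log(sum(exp(v) for v in values)) without overflow or underflow."""
    # Why not simply math.log(sum(math.exp(v) for v in values))?
    #  * For inputs around 1000, math.exp raises OverflowError.
    #  * For inputs around -1000, every term underflows to 0.0 and
    #    math.log(0.0) raises ValueError.
    # Subtracting the maximum first keeps every exponent <= 0, so the
    # largest term is exactly 1.0 and the sum stays in [1, len(values)].
    # Please do not "simplify" this back to the one-liner.
    peak = max(values)
    if math.isinf(peak):
        # All entries are -inf, or some entry is +inf. The shift below
        # would compute inf - inf = nan, so return the peak directly.
        return peak
    return peak + math.log(sum(math.exp(v - peak) for v in values))
```

Here the comment is longer than the code, and every line of it answers a question a well-meaning refactorer would otherwise have to rediscover.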
I have also written substantial functions with no comments at all because it was possible and practical to name the function and all the variables so that the meaning was clear.
It's true! If you come across a well-documented and complex piece of code, it's always easier to delete it all and write a simpler (and less correct) replacement than reading all of that code and documentation. And since your replacement is simpler, you're free to delete the inapplicable documentation that you saved so much time not reading!
If you didn't back that documentation up with some tests to reflect the necessity of that complexity, this is an avoidable tragedy. But if the junior then comes in and deletes your tests because they don't pass, that's a firin'.
In the end we had sufficiently good tests to detect such foolishness. But even that isn't good enough because if the result of the test is merely slightly off some people will just adjust the assertions!
I do all of the above. Summary comments are incredibly helpful and have been validated by empirical research. Too bad many have drunk deeply from the Clean Code koolaid and can’t be saved.
I got some flak at a prior job for saying I had some quibbles with clean code (a few years after I had read it), and I'm glad this opinion is more popular today. There's so much cargo culting hype with "best practices" and style, I hate it. Same with how overly dogmatic people were with OOP paradigms when it came out (remember using anonymous interfaces to pass a function around in java?). Same with the functional backlash to that.
It's fun and enlightening to go ham on any particular style/framework/philosophy, but actually living by dogma in prod gets kinda dangerous and imo is counter to the role of a senior+ engineer
The very term "best practice" is such a loaded term. It implies:
1) empirical measurement compared to a large number of alternatives ... the empirical measurement or study is never mentioned: because they do not exist
2) the best practice is valid in all measurements and criteria of comparison: performance, elegance, simplicity, correctness
3) since there is no data, the reasons for why the practice was designated best are rarely even explained
4) nor are the circumstances or individual or source of "best practice" detailed
5) it will always be the best practice: it is the BEST! It CANNOT be improved.
So we have an unsubstantiated, unargued, unsourced, non-authoritative, exaggerated declaration in virtually every case of "best practice"
This is an example of taking something that is contextually good advice and applying it to all situations, which turns it into bad advice.
If you can make your code more clear, so that comments aren't necessary to explain what it does or how it works, that is probably (but not always!) something you should do. But that doesn't mean you shouldn't have comments. At the very least there should be comments explaining why (or why not) things were done a certain way.
This is fine. A problem arises when you assume that this will always be the case, for all developers, and then mandate that they omit comments, _without checking if your assumption is true for their case_.
IMHO that rule can be generalized: Whenever you make rules for other devs, make sure that the assumptions on which those rules are based are true, lest you interfere with their work in a negative way.
A line of code can tell you what it does but not why. Unless you are on a newish codebase, you will likely need comments to explain why certain decisions were made
well-named works great while you're writing the code. come back to it in a few years, or hand it over to somebody new, and you would realise that what looks like a good name to you means nothing to somebody else.
That's a horrible way to name a function. Function names should be short, punchy, and unambiguous. They should create a simplified abstract narrative, and all the details should be put into the docstring, so that they can be easily accessible without having to (a) be squashed into an identifier, or (b) be repeated every time you want to call the function.
There are indeed those situations where a comment would not increase the clarity of the code.
But one should be careful not to think of this as a zero-sum dichotomy, where you either have well-named functions XOR you have comments, because in reality choosing both is often the golden path to success.
The danger is of course that code that is totally obvious to you now will take far more time to become as obvious later, be it to your future self or to your psychopathic lunatic co-worker who knows where you live.
So very often code can be made more readable by adding comments, even if it is just saying the same thing with other words, just by reducing ambiguity. Comments can also bridge higher level concepts in a good way, e.g. by you explaining in a few lines how a component fits into a concept that is spread out over multiple files etc.
In the end code is text and like regular prosaic text you can both make it harder to understand by not mentioning things or by mentioning too many or the wrong things. This is why it is not irrelevant for programmers to be good and empathic communicators. Sure in the end readability doesn't matter to the computer, but it certainly matters to all people involved.
That is like saying: "A perfectly good road needs no road markings". The point of good comments is that they make the code faster to read and less ambiguous. While good code should indeed already be readable and unambiguous, I have rarely seen code that couldn't be made even easier to understand and faster to parse by writing the appropriate comment.
But of course you will have some individuals who think it is cooler not to, and they are probably the same people who think use-after-free bugs can be avoided by the sheer willpower of the solo-male-genius that they are.
> I was specifically taught that good, readable code could explain itself; that it would make comments redundant.
Good readable code removes the need for comments about what the code does, if the working of the code needs extra explanation then perhaps it is being too clever or overly terse, but there are other classes of comment that the code simply explaining itself can't cover.
Some of my comments cover why the code does what it does, perhaps linking it to a bigger picture. That could be as simple as a link to work ticket(s) which contain (or link to) all the pertinent details, though I prefer to include a few words of explanation too in case the code is separated from whatever system those work items are logged in.
Many comments state why things were not done another way. This can be very similar to “why the code does what it does” but can be more helpful for someone (perhaps your future self) who comes along later thinking about refactoring. These can be negative notes (“considered doing X instead, but that wouldn't work because Y or interaction with Z” – if Y and Z become irrelevant that future coder can consider the alternative, if not you've saved them some time and/or aided their understanding of the bigger picture), helpful notes for future improvement (“X would be more efficient, but more complex and we don't have time to properly test the refactor ATM” or “X would be more efficient but for current use patterns the difference would be too small to warrant spending the time” – the “but” parts are not always stated as they are usually pretty obvious). A comment can also highlight what was intended as a temporary solution to mitigate external problems (“extra work here to account for X not being trapped by Y, consider removing this once that external problem is fixed” or “because X should accept Y from us but currently doesn't”).
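For illustration, a small sketch with two of these comment types side by side: a "temporary workaround" note and a "why not the other way" note. The upstream exporter and the report mentioned in the comments are hypothetical.

```python
def unique_names(names):
    """Return the distinct names in first-seen order."""
    # TEMPORARY: the upstream exporter (hypothetical) pads names with
    # spaces. Drop the strip() once that is fixed on their side.
    cleaned = (name.strip() for name in names)

    # Why not list(set(cleaned))? A set discards the original order, and
    # the report built from this list (hypothetical) must keep first-seen
    # order. dict keys preserve insertion order, so fromkeys does both jobs.
    return list(dict.fromkeys(cleaned))
```

If the exporter is fixed or the ordering requirement goes away, the next reader knows exactly which line can be simplified and why it was not simple to begin with.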
Code explains the what but not the why. And even then, the what might not be so clearly obvious.
This is one of those blindspots devs have in that they believe their code to be good and obvious to everyone but in reality it is not even good and obvious to their future selves who will be the ones maintaining that code
If the code is self-explanatory, then the comments are redundant, and it is OK to delete them. But from time to time, there are things that are at least not so obvious in the code. Then it is good to leave a comment.
That could be used to see how good a language is for specific tasks. If you need to write lots of comments, maybe you have the wrong language.
Perhaps the professor was fed up with over-commenting (comments making up a large part of the submitted code), especially if the comments were like those in https://news.ycombinator.com/item?id=41506466. Unless the course is "practical software engineering" or similar, where good programming practices are a focus, an accompanying paper can be requested if the why/why-not parts would contribute to better assessment.
When writing it inline, I like this approach. I like even better when these things have names. Something like `std.time.s_per_day` or `time_utils.s_per_day`. Then in the one place they're defined, use a pattern like the above to make them easy to reason about.
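A sketch of the named-constant idea, assuming the pattern referred to is spelling a value out as a product of its factors. The constant names mirror the hypothetical `time_utils.s_per_day` above and are not from any real library.

```python
# Hypothetical stand-in for a module like `time_utils`.
S_PER_MIN = 60
MIN_PER_HOUR = 60
HOUR_PER_DAY = 24

# Defined once, as a product of its factors, so a reader can check it
# at a glance instead of wondering where 86400 came from.
S_PER_DAY = S_PER_MIN * MIN_PER_HOUR * HOUR_PER_DAY


def days_to_seconds(days):
    """Convert a number of days to seconds."""
    return days * S_PER_DAY
```

Call sites then read `days * S_PER_DAY` and need no comment at all.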
> I did a survey once in about 3/4 of the comments were either wrong or useless.
> Examples:
> //Add 1 to x
> x+=1;
If 3/4ths of comments are like this, maybe show a sampling of public source code (e.g. from github) that shows how prevalent comments like this are in any real codebase.
I've been programming since 1982 and have never seen this type of "add 1 to x" comment in real code, outside chapter 1 of some intro to programming book.
I would write that comment if it was a long enough list of single line var assignments with enough complex ones that each have a comment like this one.
I will also name each db table column, in the correct order and by its correct name, right above whatever decides its value.
For me the quality of comments, somewhat based on the metrics that @renhanxue mentions, is a code smell. If code is poorly commented (by my standards) then I treat the actual code with suspicion.
Yes I write comments like a maniac. Long doc strings that are informally written. It’s more important for me to say what I need someone else (or future me) to know about a function and its context than it is for me to have some beautiful, sterile 300-line autogenerated soulless docstrings.
I write detailed git commit messages like a maniac.
Git commit messages are better than comments, because they tag the specific baseline of code where a decision was made, and you can write multiple paragraphs to explain something (including all the "why not"), without cluttering the program text.
The problem with comments is that they also pertain to a revision that existed around the time they were written, but they stick around, pointing to newer revisions, perhaps falsely. They add clutter. Unless you use a folding editor, comments can separate pieces of code so that you see a smaller window of the program.
One line of code can be touched by many, many commits. Each of those commits should have something to say about that, and all that talk cannot possibly be put into a giant, ever-growing comment next to that line of code. In regard to my previous point, a lot of that talk won't even be relevant to the current version of that line!
I've taken the view that the thing I'm developing is a git repo, not the source tree.
A source tarball is just something for building and deploying, not for development.
If someone wants to understand why something was done, they must use the repo, and not a source tarball. If they insist on just working with the source snapshot, but ask questions that are answerable in the git history, I cannot support them.
I think this is fine if you have buy-in within your team. Your concerns around temporality are valid, though using git over comments is a trade-off: it makes understanding a particular bit of code an exercise in reading backwards until you’ve captured the entire context. It also implies some level of fairly consistent discipline from participating members in terms of structuring commits.
That said, git commit messages as a meaningful source of information aren’t for me, I prefer reading the code as a single pane of information. We alleviate concerns about doc strings and comments becoming out of sync with the code by just reviewing comment salience as part of standard code review.
Either way, I think there’s no wrong answers, if it works for you and your team then that’s good.
It's a lot easier, in a code review, to ask for a detailed commit message that conforms to a certain format, than to say things like, "can you add comments?" Which line or change deserves a comment is subjective.
In my open source projects, I use GNU ChangeLog format in the log messages.
This mentions every file that is touched, and every defined entity (function, type, macro, variable, ...). In a code review, while it would still be difficult to enforce the quality of what is written about these things in the comment due to subjective reasons, there are objective aspects to it. When reviewing, you can point out something like "you changed the initializer of global variable g_foo, but it's not mentioned in the commit message".
Sure, API comment headers can be enforced in reviews. But those are not the comments this discussion is about (and elsewhere I identified these as necessary comments that can't be farmed out into the git log messages).
Yes, we should all be kinder to both others and to future me (you). Do we know who will be maintaining this code?
Obi-Wan meme: Of course I know him. He's me!
I've never worked in a company where the commit log message wasn't just a link/reference to something in a bug tracker.
I feel like a 'what this block of code does' comment is different from 'what is this change, and why did I make it' commit message.
I don't write what the code does or even what "I understand the code to do". I explain choices, especially ones that the next developer or my future self is likely to misunderstand when looking at the code.
I would say that I have observed something like this about as many times as someone has changed some semantics in a way that a variable or function name is no longer correct. That is to say: probably a few times in 25 years, but nothing compared to how much value I have received from them.
It could be that too, but I think that presumes an order - that the comment was written after the code. If the comment was written before the code, then it would describe what the author was trying (intended) to do. Which also implies an order of course.
Yeah, absolutely check who wrote a comment before relying on it
It's luckily rare for people to add wrong comments, though (and those who do should be publicly fustigated).
By the way, please never state something as if it were the truth if you're not sure that it is. Saying "I think" is perfectly fine, and might save people days of investigation.
This is what I tell people: if you're writing a one line code comment, write a debug log message instead.
There are vanishingly few cases where these extra logging statements will ever be a problem, and they can all be handled autonomously if they ever are. But it will save everyone else a ton of time in deployment.
That is, supposing that it's set to saturate at 100 events and they all happen fast enough that it doesn't reset the count. In this case it'll log another batch of 100 on the 201st event.
I think it's important to log the first one immediately, just in case you want to alert on that message. It wouldn't do to wait for the other 99 before raising the alarm.
I wrap my loggers in deduplicators only when I'm about to hand them off to a loop. Otherwise it's the normal ones.
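A rough sketch of such a deduplicating wrapper, under stated assumptions: the class name `DedupLogger`, the `saturate` and `reset_after` parameters, and the summary wording are all made up for illustration. The first occurrence is logged immediately, later repeats are counted, and one summary line is emitted per `saturate` repeats, so a rapid burst logs again on the 101st and 201st event.

```python
import logging
import time


class DedupLogger:
    """Log the first occurrence at once, then one summary per `saturate` repeats."""

    def __init__(self, logger, saturate=100, reset_after=60.0, clock=time.monotonic):
        self._logger = logger
        self._saturate = saturate
        self._reset_after = reset_after  # seconds of quiet before starting over
        self._clock = clock
        self._state = {}  # message -> [suppressed_count, time_of_last_event]

    def warning(self, message):
        now = self._clock()
        state = self._state.get(message)
        if state is None or now - state[1] > self._reset_after:
            if state is not None and state[0] > 0:
                # Flush repeats that were still pending before the quiet period.
                self._logger.warning("%s (repeated %d more times)", message, state[0])
            # First event (or first after a quiet period): log right away
            # so that alerting on this message is not delayed.
            self._logger.warning(message)
            self._state[message] = [0, now]
            return
        state[0] += 1
        state[1] = now
        if state[0] == self._saturate:
            self._logger.warning("%s (repeated %d more times)", message, state[0])
            state[0] = 0
```

Wrapping only the logger handed to a loop, e.g. `DedupLogger(logging.getLogger("worker"))`, keeps the normal logger untouched everywhere else.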
Do you routinely run production services at debug or trace log level?
The point is there's a big difference between "we've got a problem, we need to add logging and redeploy to try and isolate it" versus "we might have a problem, bump the logging level up on that service to see what's going on" (which with the right system you can do without even restarting).
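A small sketch of both halves of this idea using Python's standard `logging` module: a one-line comment turned into a debug message, and a helper that raises the verbosity of one logger in a running process. The service name `billing` and both function names are hypothetical, and how `set_verbosity` gets triggered (a signal, an admin endpoint, a config watcher) is left out.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("billing")  # hypothetical service name


def apply_discount(total_cents, percent):
    """Apply a percentage discount, rounding down to whole cents."""
    discounted = total_cents * (100 - percent) // 100
    # Instead of a one-line comment such as "floor to whole cents", say it
    # in a debug message that also carries the actual values.
    log.debug("discount floored to whole cents: %d -> %d (%d%%)",
              total_cents, discounted, percent)
    return discounted


def set_verbosity(service, level):
    """Change one logger's level in the running process; no redeploy needed."""
    logging.getLogger(service).setLevel(level)
```

With the default INFO level the debug line costs almost nothing; after `set_verbosity("billing", logging.DEBUG)` it starts appearing without a restart.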
This 'empirical research' is highly doubtful. The first question to ask is what the code with summary comments looked like. While summary comments can sometimes be helpful, this is mostly the case in functions that are relatively long. A question that always arises in that case is whether it is a good idea to split them instead of commenting.
We could have the best of both worlds if comments could be easily hidden, or better yet, just additional meta-data on rich text code. But nope, we can't get away from ascii.
Trouble with comments is that they drift from the code over time because most people do not update the comments, based upon my surveying of production code bases. If they are hidden, they will drift even quicker and become useless even faster.
I would urge developers to err on the side of too many comments over having too few comments, even if there's a risk of them going stale. I can deal with drifting comments, but I can't deal with missing comments.
This is caused by poor or nonexistent code review practices. Reviewers should be ensuring that related comments are updated if code functionality is changed.
i think in sean's proposed world we'd have metadata about that too! the comment in the context it was written in would be available, as well as all of the surrounding changes that potentially invalidate it. as well as potentially a whole discussion thread about what they meant when they wrote it, and suggestions about how to change it.
...and the main reason people don't like comments is because they clutter up the code that gives them the truth of the matter. But yes, if they aren't in your face forcing you to look at them rather than the code, then they are slightly more likely to not be ignored when the code is changed.
It would be nice if they could be like footnotes, or boxed out-takes, that could be pushed to side notes. We have had the typography, even if it was just markdown with a rendered code reading mode.
You don't need to abandon plain text to hide comments. Comments are detectable with a regex; an extension to hide comments would be trivial to make in most editors. I think it's not a common feature because people generally just don't want it.
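A sketch of how little is needed to hide comments, here for Python source only. Note the swapped-in technique: a plain regex is tripped up by a `#` inside a string literal, so this uses the standard library tokenizer instead. The function name `hide_comments` is made up.

```python
import io
import tokenize


def hide_comments(source):
    """Return Python source with every comment removed from view."""
    lines = source.splitlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start  # rows are 1-based, columns 0-based
            # A comment always runs to the end of its line, so cutting at
            # its start column removes exactly the comment.
            lines[row - 1] = lines[row - 1][:col].rstrip()
    return "\n".join(lines)
```

An editor extension would fold or dim those spans rather than delete them, but the detection step is the same.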
Rich text code would create so many problems, too. You get locked into a special editor. You need special version control. Grepping becomes difficult. Diffs become difficult. You'd need a whole separate ecosystem, and for what? We have treesitter; we can already treat code like data.
Junior programmers tend to either document nothing, or document everything. With experience you realise that you just want to document the unusual stuff, and as you get more experienced, you realise there is less and less unusual stuff, so the amount of comments drop down.
So, less comments wins out, but faced with a code base without comments you have to inspect it to tell the difference between a beautifully crafted piece of brilliance or a totally flaky codebase written by a bunch of newbie hacks.
> and as you get more experienced, you realise there is less and less unusual stuff, so the amount of comments drop down.
This is a common experience when I use a new language or framework. At the start I comment a lot of things because I think it's useful for future me, but as I learn more I realize that many of the comments are redundant so I end up removing them.
You have to consider who the comments are for. Senior engineers' comments are more useful for senior engineers, but the junior comments will be more useful for junior engineers.
I don't know. Senior engineers may understand the point of the code already, and "why it's not written a different way." The comment could be explaining to the less-experienced devs why they shouldn't waste time on another intuitive-at-first-glance approach.
There's also the comments that tell you why the code does something obviously stupid but needs to continue replicating this stupid behavior because something else depends on it behaving this way.
That sounds like "explain[ing] why the code isn't written in another way" where the other way is "the way you are right now thinking it should be as you read this: the way that isn't obviously stupid".
I'm thinking less in "way the code is written" and more "the way the code works".
That is things like cases it doesn't support, or weird results in some cases.
One example is a shop I worked at many years ago that had a function which did time conversion on date times without adjusting the date portion. Obviously, objectively, wrong.
However the callers to this function expected this errant behavior. Actually changing it to behave correctly would require a coordinated change. Putting that comment in place prevents a new hire from going "oh this is obviously wrong!" and fixing it, causing an outage.
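A hypothetical reconstruction of what such a function and its warning comment might look like. The name, signature and comment wording are invented for illustration; the original code is not shown in the thread.

```python
from datetime import datetime, timedelta


def shift_time_keep_date(moment, offset):
    """Shift the time of day by `offset` but keep the original date."""
    # KNOWN-WRONG, ON PURPOSE: the date is NOT adjusted when the shift
    # crosses midnight (23:00 + 2h gives 01:00 on the SAME day).
    # Existing callers compensate for this themselves. "Fixing" it here
    # without changing every caller in the same release would break them.
    shifted = moment + offset
    return datetime.combine(moment.date(), shifted.time())
```

For example, `shift_time_keep_date(datetime(2024, 1, 1, 23, 0), timedelta(hours=2))` returns 01:00 on January 1st, not January 2nd, and the comment is the only thing standing between that line and a helpful new hire.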
> A senior engineer writes comments that explain why the code isn't written in another way.
And C-level engineers write a comment "X hours have been wasted on refactoring this code. Should you decide to follow the example of your priors, please increment this counter."
Sometimes, even what appears to be an utter hack job actually is the best you're gonna get.
I think of comments as an apology to the engineer reading the code.
I'm apologising because the code isn't obvious, or the language not sufficiently expressive, or the good-idea-at-the-time no longer is. Ideally I wouldn't need to write many comments, but I often find myself sorry things are not simpler.
In the end a comment should be like a shortcut that allows people to understand your code faster or a reminder why certain choices were made.
You are right in spirit to say sorry when the code cannot be self-explanatory, but consider that sometimes, even if the code is self-explanatory, comments can help your reader to see the grand picture faster or to avoid interpersonal ambiguity by allowing you to use two different ways of phrasing a thing.
I find that the purpose of comments is to give context to the written code. Sometimes the context explains what the code does, and more often it explains why, but the best comments give me, the reader, an insight so that I can have an intuitive feeling for what the author is writing.
Well, so, does this method return the login name of the user ? Or is the "name" field in fact a company convention to return firstname and lastname concatenation ? Oh and if it’s that, what will be the concatenation order ? It doesn’t seems like it’s a parameter but is it configurable somewhere ? Is this automatically defined somehow ? Is it hardcoded ? Or is there a name field in the database and if yes, what does it represent ? Etc …
Within the code you'd have `return this.user.firstName + ' ' + this.user.lastName;` or similar, which can explain the details of how it works when you look into it.
I'm not super opposed to that, honestly. Doc comments are coloured differently from code, and really help you scan through a file quickly to find the next function. It honestly doesn't take more than a couple seconds to write, and (I find) it really helps readability enough to be worth it.
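A sketch of a doc comment that answers the questions raised above in the one place a caller will look. The `User` class and its fields are hypothetical, loosely mirroring the `firstName`/`lastName` snippet.

```python
from dataclasses import dataclass


@dataclass
class User:
    login: str
    first_name: str
    last_name: str

    def get_name(self):
        """Return the display name: first name, a space, then last name.

        This is not the login name. The order is hardcoded here and is
        not configurable.
        """
        return f"{self.first_name} {self.last_name}"
```

The body already says how the name is built; the docstring says what it is not, which the body cannot.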
I resonate with that example. It's usually about a piece of code that didn't turn out the way you wanted to due to an issue. If there's a literal github issue or source I can link, I'll include the URL in case someone checks later and it's no longer a blocker.
Sometimes I include brief explanations or links to a particular piece of hard-to-find documentation.
> A senior engineer writes comments that explain why the code isn't written in another way
I suggest providing that sort of high-level decision information in a separate design document. This sort of documentation is written before the code and is presented/discussed with management and peers.
That is how it was done at a couple of companies I worked at and it was very effective. Naturally, this was done at the feature level, not on a function to function basis.
Then you end up with a design document that corresponds to the code as it existed conceptually before v1 of that code was even written. By the time v1 is written, the design doc will be slightly out of date; by the time the code reaches v3, the document is more than 2 versions out of date.
The nice thing about comments is that they're in the code. You can't update the code without at least looking at them; that's more visibility than you'll get from anything else.
Design documents were updated as the code changed, obviously. Changing the code without updating the documents would not pass code review if somebody was shortsighted enough to try.
The kind of design doc that gets presented to management typically isn't checked into the git repo, in my experience. It's a google doc or a wiki page full of planning details, and it gets covered in amendments about whether XYZ will be ready in time and so forth.
Having documentation in your repo describing high-level architecture choices and peculiar quirks for future maintainers is nice, sure, but I think that's a separate document with different goals and relatively little overlap. It shouldn't be organized by feature, for starters.
Also, if the remarks in that document are granular enough to exist as comments, why not just leave them as comments? Then they're more visible and more likely to remain up-to-date.
That still needs discipline though - or you end up with N half-finished Confluence pages describing the intention behind the design, all of which are now out of date (and naturally in completely different places). The best way I've seen to keep track of changing things is to have the design linked to the ticket somehow (and if it's a link, then that needs to be a permalink to something that will not go away in a year's time).
I really have to push back against this "design document" stuff. Unless you're writing some sort of uber-complicated, mission-guidance-systems-level code with multiple standards and audit compliance and and and, then you don't need this.
Alternatively, if your org or reach for this feature is so large that you need to communicate and decide its internals with a lot of people and as such require something like a design document, then you've already failed and you have way too-many cooks in your kitchen. Design by committee is always doomed to failure.
Thirdly... if you need a design document to communicate your feature to "management" then that means you don't have autonomy to design this feature as an expert, and again, you have too many "fingers" in your pie.
Does this mean you should go into a basement and design your feature like a hermit? No, but a design document shouldn't be your answer, it should be a better team and process around you, as well as a clear business specification.
> Design by committee is always doomed to failure.
If you need to communicate your design to a lot of people, that's probably because a lot of people are depending on the output of your work, and they want to make sure it suits their needs. That's not design by committee—that's just designing something that will be widely used.
I understood that to mean "I've tried doing it x way and it didn't work because y." rather than the functional part. At that point, why not keep the documentation together with the code?
I comment everything that I think would be useful for me when revisiting the code a year later. Usually "why" and "why not". Sometimes a short "what" when the code is complex and it's nice to see the sequence more clearly.
What's not so useful: mandatory comments. A public API should be thoroughly documented, but some shops insist on writing comments for every function in the code, even private ones and even if its purpose is so obvious that the comment just rephrases its name. This practice is not only a waste of time, but also desensitizes you to comments and teaches you to ignore them.
Other wasteful comments are added by some tools. I hate the one that marks every loop with a //for or //try comment.
For some reason a lot of syntax highlighting color schemes de-emphasize comments, making them low contrast, which is probably because a lot of mandatory / generated comments are low information. Get rid of mandatory and generated comments, and change your color scheme to make them a bright neon colour instead (on a dark theme) to draw the attention, because IF something is commented then it's important.
I disagree with this. Comments are important but not every time you're in that part of the code. You might already know from just the code what's going on, or have already checked out the comments. There's no value to them always being emphasized. That would be like reading all NPC dialogs every time in an RPG.
If there is a comment, it should be important enough to read every single time you are near that area of code, even if you are looking for something else in the file! Comments should be a reminder of something important and not obvious in the code - otherwise I'll just read the code.
Note that I distinguish comments from API documentation even though they are often both in the same code and use the same comment syntax.
If there is something really important worth rereading over and over again, put «NOTE: » before it and let your editor highlight it for readers of your code
> I comment everything that I think would be useful for me when revisiting the code a year later.
Do you code solo or do you work with a team? If so, how large is the largest team you've worked with?
I used to be a dogmatic "all comments are code smells" person and, to a large degree I still am. But working on a very (and I mean VERY) large code-base that is actively developed and maintained by hundreds of other software developers, I have relaxed my position slightly into the "if you need to do something weird, explain why" ... because a large legacy system that lives in a business environment of tight deadlines means that there are often weird things that need to be done to keep things moving at a pace that the business is willing to pay for.
Anyway, one of the many reasons that I argue AGAINST code comments is that the comments become part of the code and therefore require maintenance. But few people read comments unless they are stuck trying to understand something. This "psychological invisibility" is even enforced by the fact that most code editors will grey out comments in order to make them less distracting.
And therefore, comments can easily become outdated.
So I'm curious about your situation. Since you say that you like to give yourself useful context for "future you", what context does this process serve? Do you find it useful when working on a shared codebase with lots of other developers? Or is it something that only works well when there are few developers touching the code?
> And therefore, comments can easily become outdated.
I am firmly of the opinion that keeping the comments up to date is part of the job of the developer. And if they're not doing it, they're not doing their job. It's no different than taking the time to write code that does the right thing, expresses the intent of what it's trying to do, etc. And, when code is peer reviewed, not including/updating comments in places where they are important should be a reason to send the code back to the developer.
I feel the same way about automated testing. They're both (testing and comments) ways to make our code better: easier to understand, easier to maintain, more likely to be correct, etc.
> I am firmly of the opinion that keeping the comments up to date is part of the job of the developer.
I see where this comes from, but practicality is a huge issue. Comments don't become obsolete just by laziness, but also because they are human targeted and can't easily be managed the same way code is.
For instance if you comment on a function call to explain it's needed because of its side effects, when the function gets fixed and loses that side effect your comment becomes obsolete.
Will the person fixing the inner function go back to every single comment that vaguely references its behaviour and fix them? And do they also go to the one-up parent function to check the comments there? If we're assuming large orgs with expansive code bases, it's just not realistic to be dealing with all that prose and maintain consistency when what you're modifying can be mentioned anywhere in any tangential way.
> For instance if you comment on a function call to explain it's needed because of its side effects, when the function gets fixed and loses that side effect your comment becomes obsolete.
And if you have a function call in your code that doesn't look like it does anything and don't comment it, then you're doing it wrong.
And if the person who wrote `functionWithSideEffects` changes it to not have side effects anymore, and the places that call it for the side effects don't get updated then, once again, someone isn't doing their job.
Keeping comments up to date is no different than keeping code up to date, or tests up to date, or removing code that isn't needed anymore, or cleaning code up so it doesn't turn into spaghetti, or anything else we do as developers. Does it always get done? Of course not. But that doesn't mean it shouldn't get done.
> And if you have a function call in your code that doesn't look like it does anything and don't comment it, then you're doing it wrong.
We'll leave a comment, but it needs to be accepted that it will rot.
That's where personally I'll explicitly date the comment to give the future dev a fighting chance to understand the context, if it's not directly linked to a ticket or some other documentation.
> And if the person who wrote `functionWithSideEffects` changes it to not have side effects anymore, and the places that call it for the side effects don't get updated then, once again, someone isn't doing their job.
It will work or not. In the meanwhile the side effect might not be useful anymore and become a useless call, or it might actually be doing something else that is now a primary effect. Hopefully there will be tests or external checks to validate the functionality isn't broken.
It won't be ideal, but it's also a reality of a huge code base.
> Keeping comments up to date is no different than keeping code up to date,
The big difference being that at the end of the day comments are just comments, and they will always be a second class citizen relative to executable and parsed code. Sure in an ideal world we could ask people to treat them as critical, but short of having an actual organization wide strict enforcement of it with successful detection of comments that haven't been properly updated, it would be unprofessional to ignore the reality of it.
For code we have static analysis, dynamic analysis, white and black box tests, and user feedback. Comments won't have that level of scrutiny, so accepting that fact and treating them accordingly feels logical to me.
> It will work or not. In the meanwhile the side effect might not be useful anymore and become a useless call, or it might actually be doing something else that is now a primary effect. Hopefully there will be tests or external checks to validate the functionality isn't broken.
If I have a Function_A that calls Function_B because it needs something Function_B does...
- If Function_A no longer needs that side effect, then Function_A should be updated to reflect that
- If Function_B no longer implements that side effect, then changing it has broken Function_A (and either B needs to be put back the way it was, or A needs to be updated to not rely on B doing it)
Both of those alternatives result in Function_A needing to be updated if it's no longer relying on Function_B doing the thing it was relying on it for. As such, any comment discussing that reliance can/should be changed when the code is changed.
Is there some other alternative that I'm not thinking of? Can you provide a concrete example, because I'm not following you.
> Comments won't have that level of scrutiny, so accepting that fact and treating them accordingly feels logical to me.
When I go into code, especially code with lots of relationships between them (large parent/child hierarchy, etc), it can be hard to understand what the code is doing. Comments make that understanding faster. They can be a difference of hours poring over code. Even just a comment at the top of a file indicating what the code/class in the file is for (its purpose for existing) and what other classes it's directly related to / uses can be a HUGE speed gain. The idea of not bothering to include that information because someone can't be bothered to do a good job keeping it up to date is just... confusing to me.
Function_A needs a connection to some file server, and for system specific reasons it wants it to have no timeout. For that it calls Function_B that is targeted to batch processing boiler plate, but also extends file server time outs. Function_A isn't a batch, but will still rely on Function_B for that side effect, with some comment on that.
One day the org upgrades its infra, and file server connections radically change but they take care of preserving compatibility. But there are no timeouts anymore; setting them does nothing.
They have the choice to go through the hundreds of instances where Function_B was called, look at the use case, and update it, through dozens of MRs throughout the whole organization. Most of it probably on code that doesn't belong to the team making the file server change.
Or they can get rid of the timeout management in Function_B, alert the org of the change and request each team to deal with it on their own terms, and call it a day. Function_A will continue calling Function_B, and perhaps the new system timeouts are irrelevant, or perhaps new optimization options were added, and it happens that Function_B sets the right flags for Function_A to work in the best conditions.
The reasons to call Function_B will have either disappeared or changed, but the code will continue working, and at some point the responsible team might want to update the comments to reflect that, but it won't be a priority.
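To make the scenario concrete, here is a minimal Python sketch of the situation before the infra change. `Connection`, `function_a`, and `function_b` are hypothetical stand-ins for the Function_A / Function_B described above, and the dated comment is only a placeholder for the dating habit mentioned earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    """Hypothetical stand-in for a file-server connection."""
    timeout_seconds: Optional[float] = 30.0  # None means "no timeout"


def function_b(conn: Connection) -> list[str]:
    """Batch-processing boilerplate (hypothetical).

    Side effect: removes the connection timeout, because batches run long.
    """
    conn.timeout_seconds = None
    return ["batch boilerplate done"]


def function_a(conn: Connection) -> Optional[float]:
    # Dated comment (YYYY-MM-DD): function_a is NOT a batch job. We call
    # function_b only for its side effect of removing the file-server
    # timeout. If function_b stops doing that, revisit this call.
    function_b(conn)
    return conn.timeout_seconds
```

Once `function_b` stops touching the timeout, `function_a` keeps running, and the comment is the only thing that is now wrong, which is exactly the kind of rot being discussed.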
> understand what the code is doing
To me you're looking for design documents, not code comments.
I feel your pain, while also seeing no real good solution to understanding really fast a complicated and huge project. Comments can sure help, the same way they can also be deceptive and send new devs into spirals.
I see it as a tradeoff between optimizing for new devs' accessibility and paying the price in daily development and maintenance time, or having newcomers spend a lot of time upfront in exchange for more velocity for the existing members.
I personally tend to prefer the latter orgs, and find the code quality to actually be much better. It's usually a sign the org isn't shuffling or pumping new devs here and there every thursday, and people get to spend the time to understand the code base as it actually works, without needing to take too many shortcuts (I wouldn't see asking for an existing member to walk through the code as a shortcut btw)
I think color schemes that make comments nearly invisible are just bad. Visual Studio's default color schemes have it right. Comments are not treated as lesser than anything else.
As long as they aren't being placed every couple of lines they shouldn't really be a nuisance when reading through code.
And to me not updating the comments is the same as not updating the name of a variable when its purpose changes. The compiler doesn't care that you didn't update the name, but people reading the code do.
Also when it comes to "future you", you basically should think of yourself in the future as a different person. Because unless you are constantly working with the same bits of code, you will forget details and possibly even the high-level.
Yes, completely agree. Also, too many comments make it difficult to see what is in a class/function. If the comments make a class/function that would otherwise fit on one screen no longer fit on one screen there is a readability cost to this.
I wish they would put the comments before the function name in python, as otherwise the useful code is separated by useless verbiage. I also wish comments would include examples of the shape and dtype of inputs and outputs.
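On the second wish, a docstring can carry shape and element-type examples without any tooling. This is just a sketch on plain Python lists; `row_means` is a made-up function, and with array libraries the same layout would presumably list the array shape and dtype instead.

```python
def row_means(matrix: list[list[float]]) -> list[float]:
    """Return the mean of each row.

    Input:  matrix, shape (n_rows, n_cols), elements float
            e.g. [[1.0, 3.0], [2.0, 4.0]]   (shape (2, 2))
    Output: list, shape (n_rows,), elements float
            e.g. [2.0, 3.0]                 (shape (2,))
    """
    # One mean per row; assumes every row is non-empty.
    return [sum(row) / len(row) for row in matrix]
```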
Collapsing/hiding comments should be a required feature in editors.
In Leo[1], you can make nodes out of them and just hide them. If Emacs weren't so good, I'd be using Leo. Similar power when it comes to extensibility, but extended via Python, not elisp.
Related anecdote: yesterday I was on a code portion of a personal project with a "Is this really useful?" comment on a line that seemed it could easily be removed. I tried to use the newer and cleaner class instead and the particular old way was indeed needed. So I appended a "=> yes!" to the existing comment as well. I'm glad my former self documented the interrogation.
At work, especially on bugfixes, I often write a one- or two-line comment with the ticket issue number over a non-obvious change.
It was pretty common say 25 years ago when you would be developing in a terminal. When you were limited in the number of lines displayed, it sometimes made it easier to follow the code when functions and control structures were large.
I know I had coworkers using Brief configured to do that.
I think the favourite type of comment I've ever left in my code follows this template:
DEAR MAINTAINER:
This code is the way it is because of <reasons go here>.
Once you are done trying to 'fix' this, and have realised what a terrible
mistake that was, please increment the counter as a warning to the next
person:
total_hours_wasted_here = n
I'm not the original author, but have gratefully used it once or twice, and been amused when there was a single line commit incrementing the counter.
I one time wrote this fairly elaborate SQL generation thing that required pretty liberal use of recursion in order to fulfill all the requirements. It ended up being a lot of mutual recursion and the code was admittedly kind of messy but it was a necessary evil to do everything asked of me.
A more senior engineer ended up taking over the codebase, "fixed" all my code to be this iterative thing, made a point to try and lecture me about why recursion is bad in an email, only for his code to not actually do everything required and him reinventing effectively everything I did with recursion.
In fairness, he did actually apologize to me for some of the comments he made, but if I had thought about putting this comment on the top maybe we could have avoided the whole thing.
I agree that the title is ambiguous - it's what piqued my interest to read the article in the first place. Personally I lean toward fewer comments overall - perhaps to a fault - but explanatory comments as shown in the article are absolutely valuable. It's a good reminder to explain the whys and the why nots.
This especially applies to your own code that you write and still have to maintain 5, 10, 15 years later. Just the other day I was reviewing a coworker's new code and thought "why choose to do it this way?" when the reason was 10 lines up where I did it the same way, 8 years ago. She was following the cardinal rule of maintenance - make the code look like the existing code.
This is so undervalued when maintaining an older codebase. Please, for the sanity of those who come after you - make the code look like the existing code.
"Please, for the sanity of those who come after you - make the code look like the existing code."
I think it's a great rule of thumb... but there are exceptions.
For example, funnily enough I was reviewing some code just today written by someone else and that I'm going to be asked to expand on or maintain. It looks like this:
var example1 = Object;
var example2 = Object;
var example3 = Object;
...
var example147 = Object;
And then later there are corresponding variables assigned to different objects which are:
var tempExample1 = <some object instantiation>;
var tempExample2 = <some object instantiation>;
var tempExample3 = <some object instantiation>;
...
var tempExample147 = <some object instantiation>;
And this goes on and on. It's real special once we get into the real business logic. (and to be clear... "example1" are the actual names, I'm not just obfuscating for this comment.)
The reason it looks like this is because the original developer copied the examples from the developer technical documentation for what they were doing verbatim; this documentation only had one variable so the numbering was the way to get multiple variables. Knowing this system, they didn't have to do that, they could have very easily assigned meaningful names for all of this. (To be fair, the original developer isn't a developer but a consultant business analyst saying, "Oh yeah, I can do that!" to the client.... billing all the way).
I can tell you with great certainty and righteousness: I'm not going to make my code look like the existing code. I may well do some refactoring to make the existing code vaguely scrutable.
I appreciate that what I'm describing is an extreme case and not really what you or parent comments really were addressing. I just stop to point out that what you describe is a rule of thumb... a good one... but one nonetheless. And, as an absolute rule of thumb about rules of thumb, there are no absolute rules of thumb. Ultimately, experience and judgement do matter when approaching the development of new code or decades old code.
The concept of "make the new code look like the existing code" still applies - in the example you gave, if you need to add examples148-200, and want to do it in a better way, then it would be wrong to do that new way for the new code; either you are willing and able to refactor the previous 147 cases as well (so that the new code matches the existing code, because the existing code was updated), or you keep the existing structure.
Unless of course you work in a wild-west codebase where you can basically tell who wrote what code because everyone has a distinct style and they never converge. ugh
This is just a special case of a broader, more general advice that I follow:
Comment on whatever would be surprising when you read the code.
When I write code, a voice in the back of my head constantly asks “will I understand this code later?”. (People who just instinctively answer ‘yes’ every time are arrogant and often wrong.) Whenever the answer is ‘not sure’, the next obvious question is “why not?”. Answering that question leads you directly to what you need to write in your comment.
Sometimes the answer is “because the reader of the code might wonder why I didn't write it another way”, and that's the special case this article covers. But sometimes the answer is “because it's not obvious how it works or why it's correct” and that clearly requires a different type of comment.
As an addition, if you first try to write the code one way, and then it doesn't work and you need a second approach, that's a really good indication that you want some kind of comment there. You were surprised while writing it, so if you'll forget that surprise in a year, you'll be surprised by reading it too.
My driving principle in the same vein is "where do I have to look when/if this doesn't behave as expected?" - if the answer is not in the docs (wiki -> package/module -> file -> class -> function / method) or is otherwise > 10 lines away, it gets an inline comment (or the docs are updated). Usually this happens when chopping up strings or during intermediate navigation steps over an odd data structure.
> I see more people arguing that whys do not belong in comments either, that they can be embedded into LongFunctionNames or the names of test cases. Virtually all "self-documenting" codebases add documentation through the addition of identifiers.
Identifiers can go a _long_ way, but not _all_ the way. I personally am a fan of requiring documentation on any public methods or variables/fields/parameters (using jsdocs/xmldoc/etc). Having a good name for the method is important, but having to write a quick blurb about what it does helps to make it even clearer, and more importantly, points out obvious flaws:
* Often even the first sentence will cause you to realize there's a better name for the method
* If you start using "and" in the description, it is a good indication that the method does too much and can be broken down in a more logical way
People often think properties are so clear they don't need docs, then write things like:
/** The API key */
string ApiKey;
But there's so much missing: where does this key come from? Is this only internal or is it passed from/to external systems? Is this required, and can it be null or empty? Is there a maximum? What happens if a bad value (whatever that is) is used? Is there a spot in code or other docs where I could read more (or all these questions are already answered)?
This is stuff that as the original author of the code you know and can write in a minute or two, but as a newcomer -- whether modifying it, using it, or just parachuted in to fix a bug years later -- could take _hours_ to figure out.
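As a rough illustration of what answering those questions could look like, here is a Python sketch. `ServiceConfig`, the `BILLING_API_KEY` variable, and every stated behaviour are invented for the example; the point is only the kind of information the doc comment carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical configuration object; every detail below is an assumption."""

    #: API key used to authenticate outgoing requests to the (hypothetical)
    #: billing service.
    #: - Source: read from the BILLING_API_KEY environment variable at startup.
    #: - Scope: sent to the external service; never logged or echoed to clients.
    #: - Required: must be a non-empty string; there is no default.
    #: - Bad value: requests fail with an authentication error.
    api_key: str

    def __post_init__(self) -> None:
        # Enforce the "required, non-empty" rule stated in the doc comment.
        if not self.api_key:
            raise ValueError("api_key must be a non-empty string")
```

Compared with `/** The API key */`, this takes a couple of minutes for the author and answers most of what a newcomer would otherwise have to dig for.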
I often write comments like this when I can predict what an overly nitpicky reviewer will say in a code review - "I didn't do X because Y" hoping to save some annoying back and forth about it.
They have value in the code because they save time when someone has to deal with that code a couple of years later. Certainly the explanation could be in the code reviews or the commit message, but it's easiest if it is right there.
It's an obvious cliche, but that "someone" is very frequently you. It's really easy to forget why you made some non-obvious decision and waste time poking at the exact same stuff you already did a year or two ago. It's happened to me several times.
When you figure out something tricky, leave a comment.
completely agree. every person who leaves detailed “unnecessary” comments like this has been bitten by coming back to a codebase a year+ later and going “who was the idiot that wrote this and why didn't they leave any clue behind” and realizing that yes, you were the idiot. or has had to come behind someone that left zero documentation or readable code and been tasked with cleaning it up. breadcrumbs are useful and comments cost nothing. yes, there are commit messages, but commits often aren’t super clean, explicit, coherent, or even looked at.
Yup. Our operating principle was that if a question was asked in a code review, someone will likely have the same question when reading the code weeks/months/years from now and there should be a comment.
This is a great idea for a lot of things! Sometimes I will write a loop I know is slow as hell but works for now, and it would be cool to use a timer and do, "if this takes more than X seconds, log a warning." I mean, the perfect logging and observability should theoretically time all the components of your app, log debug information often, etc., but who really uses a perfect system like that? I think it's more important to make a specific effort to log in areas where performance might suck later or where you didn't have time to optimize or do exactly what you wanted to do.
Your comment is super obvious, in hindsight, but I never thought to do something like this, usually at past places of work we've just agreed to make a comment on things we need to reconsider later and hopefully we remember about that comment when things go downhill!
I'd caution that it's a very "80/20" thing, since the cost/complexity of detecting all "unusual" cases can easily become not-worth-it.
For example, if the slowness of the method depends on non-trivial aspects of the input, or if the performance problem in production comes from someone calling your method too many times on (individually) reasonable chunks of data.
I find that a lot of well-meaning tools (e.g. bazel) regard non-error output as noise to be hidden. They'll often route any messages like this to the nearest wastebin they can find, which makes logging incredibly unreliable as a means of communicating constraints in some domains.
I'm thinking the log message is something you wouldn't want to trigger in production anyway, unless you're pretty confident that it won't lead to a ton of spam on the day that inputs rise over the threshold.
Using a debug log-level helps prevent that problem from occurring in production, while still making it visible to developers that have more logging enabled in their dev/test environments.
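For what it's worth, the timer idea with a configurable log level fits in a few lines of Python. This is a sketch, not a recommendation of any particular library; `log_if_slow` and `slow_sum` are made-up names, and the 1.0 second threshold is arbitrary.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def log_if_slow(label: str, threshold_seconds: float, level: int = logging.DEBUG):
    """Log a message if the wrapped block takes longer than threshold_seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed > threshold_seconds:
            logger.log(
                level,
                "%s took %.3fs (threshold %.3fs)",
                label, elapsed, threshold_seconds,
            )


def slow_sum(values: list[int]) -> int:
    # Known-slow placeholder loop; the timer tells us when it starts to matter.
    with log_if_slow("slow_sum", threshold_seconds=1.0):
        total = 0
        for v in values:
            total += v
        return total
```

Defaulting to `logging.DEBUG` keeps it quiet in production unless someone opts in, which matches the concern about log spam; it does nothing for the case where the method is fast per call but called too often.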
I personally don't care what anyone says, I use comments and doc comments ALL OVER the place; I do it in reverse, though. I write a list of steps for the application as comments, a rough draft at first, then as I develop the code I take the big steps and split them into little steps, sometimes removing the original comment and sometimes not, and I continue to split comments into smaller steps until I have nearly a complete algorithm. Then I just code the logic in there. I normally will code from the outside in, so I'll also be writing code as I do the comment-splitting stuff. Sometimes I get off on a tear and I code a bunch of stuff at once, but then later I go back and comment it down to a level that I think most of you would find annoying. Every function and variable has a comment about what it does, even the `deg_to_rad` function has a comment `"""Converts degrees to radians."""`. Why not, storage is cheap!
I know most people don't like it, and that is fine, they can deal with it! If they don't want to see my comments, they can remove them from their version of my code with a script, and if my co-workers and boss don't like them they can remove them in a code review! However, I can say that I enjoy reading my old code way more than I enjoy reading other's code which have zero comments. I work in Python, so a lot of the simple non-algorithm code (boilerplate stuff for apps, like flask APIs for example) is mostly "self-documenting" since the old saying goes, "write some pseudo-code and 95% of the time it runs in Python." The most important comments are sometimes on the boilerplate stuff because that's where a lot of changes happen versus the algorithms where I find there is a lot more wholesale rewriting in my industry.
I also tend to do so but only for top-level constructs, where such comments are most useful. I think I do so because I like to conceptualize entirely within my mind; I found it rather slow to experiment with various design choices by actually writing them down, at least initially. So I necessarily have to document these design choices once they are settled (as others don't yet have access to my mind :-), while I expect details become much clearer after others also have conceptualized them to their minds. This approach does have a downside of making it possibly harder to read when such conceptualization couldn't be done for any reason, so I do additional tweaking to maintain the baseline readability to the minimum required.
Comments are great when they are well maintained. But for every codebase that isn't basically some open source software where the eyeballs-reading to comment-maintaining ratio is really high and people spend the time, everyone eventually forgets and/or is too lazy to maintain them. It can be almost as much work updating the comments as adjusting the code a lot of the time. So the ground truth reality is that comments are usually "lies waiting to happen". Eventually, the comments and the code won't be in sync, and this can be potentially worse than having either bare, uncommented code, or better, having actual automated tests that show intent. The tests mostly can't lie, otherwise you wouldn't have merged them presumably. If you show me a bunch of decent tests laying out how the code is intended to be used, that is what I want to see. Because it is explanatory AND it is nearly guaranteed to be truthful.
I'll take the out of date comments. They provide a red flag that either the original author was confused, or the behavior changed over time. At least then you can do code archeology to piece together descriptions of what the original intent was, and how it changed using the commit history and figure out where things went off the rails, and thus determine how to actually fix it rather than patch over it even worse.
The problem with unit test as documentation is that over time they end up reflecting the same misconceptions that the code has. Someone does a refactor, misunderstands how the original code works, and "fixes" the unit tests to pass. Now you have tests that lie just like comments can lie.
Neither comments nor lacking unit tests are going to fix that, that's a strawman argument. What I'm saying is if someone doesn't know 100% what they're doing, which can be relatively often, then I would hands-down take a bunch of unit tests over some misguided comments because at least the unit tests have to /pass/, so there is some grounding in reality. A comment has absolutely no such requirement and therefore can stray in any direction forever. Also, if you refactor a name of a variable or input or function, you will be required to update the tests, so they are far safer from becoming stale due to refactoring compared to an opaque string from the IDE's point of view.
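A small sketch of what "tests that show intent" might look like, using Python's `unittest`. `chunk` is a made-up function; the test names carry the same information a comment would, and they break loudly if the function is renamed or its behaviour changes.

```python
import unittest


def chunk(items: list[int], size: int) -> list[list[int]]:
    """Split items into consecutive chunks of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkIntentTest(unittest.TestCase):
    def test_last_chunk_may_be_shorter(self):
        self.assertEqual(chunk([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    def test_empty_input_gives_no_chunks(self):
        self.assertEqual(chunk([], 3), [])

    def test_non_positive_size_is_rejected(self):
        with self.assertRaises(ValueError):
            chunk([1], 0)
```

None of this stops someone from editing the assertions to match a broken refactor, which is the objection raised above; it only guarantees that the tests and the code agree with each other.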
I never found that to be a problem in practice. Yes, comments do get slightly out of sync, and not always corrected. But typically enough comments are correct that you can make sense of the whole thing, and can even fix the comments then.
By contrast, I have really never seen truly self documenting code. The comments may be up to date by virtue of not existing, but the end result is just more confusing. YMMV
Same, my coworker used this argument when I tried to get them to write more comments. The number of hours I've wasted due to there being no comments at all is way, way, way more than the very small number I've wasted due to a no-longer-accurate comment. And I usually find that people who write self-documenting code choose variable/function names that might make sense to them at the time, but are vague and confusing to me when I need to read the code.
Same here as well, I hear this argument a lot, but have yet to see a comment I didn't appreciate. And if there's something wrong with the comment, fix it! Self documenting code is a joke.
Code that breaks is much more likely and damaging than comments that are out of date. In my life I've never had a case where the bad outweighed the good for comments.
Also, if you're doing regular code reviews, wouldn't that include making sure the comments are up-to-date?
As for the argument that it wastes time to update comments, I've never seen a codebase where the comments were so voluminous that it would be a significant burden to update them. Devs avoid updating comments because there is usually no penalty for skipping it (boss doesn't call them on it, and the code still runs), and they think they've saved a few minutes when in fact they're just kicking the can down the road for the next dev who has to decipher the incorrect (or missing) comment.
Code breaks because people read the comment and assume it speaks truth when in fact it's full of lies. And as soon as you don't trust comments you have to start reading the code. And if you're at that point now..... Why have the comment?
Comments don't replace reading code, they prime your brain for the code you're about to read. There is often context and reason behind the code that is not in the code itself.
If you ever trust comments over code you really need to adjust your approach. Comments are there to help guide you towards the intent of code, not to let you ignore code. Same reason why if a function has a bug, you don't ignore that bug just because the function's doc says otherwise.
> I know most people don't like it, and that is fine, they can deal with it! If they don't want to see my comments, they can remove them from their version of my code with a script, and if my co-workers and boss don't like them they can remove them in a code review!
It was all great until you got here. This is a big red flag for me for a teammate.
Half the time I start by just writing comments explaining what I'm about to try and do, then I go back and add comments about how things did not go as expected, and what I had to do to get it to actually work. Super helpful 5 weeks later when I have to actually see it again.
I don't think doc comments are warranted in every case and definitely don't think you need to document every single parameter, return value, etc. for every single function, but they sure do come in handy for complicated APIs.
You're wasting so much time doing this... Nobody will ever read 99% of it. Then the comments will get out of date and it'll end up worse than having no comments.
Ah yes, Comment Driven Development (CDD), I practice it as well.
You start with comments of how you want things to work, and fill in the code. It's a perfect combination of why/how and communicates to the next developer the high level thought process behind the code perfectly.
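For anyone who hasn't seen the style, here is roughly what the end result of comment-first development can look like in Python. `deg_to_rad` and its docstring come from the comment further up the thread; `arc_length` and its steps are invented for illustration.

```python
import math


def deg_to_rad(degrees: float) -> float:
    """Converts degrees to radians."""
    return degrees * math.pi / 180.0


def arc_length(radius: float, angle_degrees: float) -> float:
    # Step 1: reject inputs that make no geometric sense.
    if radius < 0:
        raise ValueError("radius must be non-negative")

    # Step 2: convert the angle to radians, since the formula expects them.
    angle_radians = deg_to_rad(angle_degrees)

    # Step 3: arc length = radius * angle (in radians).
    return radius * angle_radians
```

The step comments were the outline before any code existed; whether they stay once the code is written is the part people disagree on.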
I subscribe to the notion that 'comments are apologies' (to my future self).
If a piece of code is weird, or slow, or you'd say "yeah, it's kinda janky" when describing to somebody, I usually write a comment about it. Especially if I've changed it before; to document some case that didn't work, or I fixed, or whatever.
When you operate on this basis, superfluous comments just melt away, and you typically end up documenting 'why' only when it's really necessary.
Try it out in your own codebase for a month and see how it feels :)
I'd say that it is one of the few cases where comments are the best solution.
You can't have functional code for what isn't done, so that's some information you can't express in code.
Furthermore, a major problem with comments is that you can't debug or test them. There are many tools that can analyze code, static or runtime, but because comments are just free text, you can't do much besides spellchecking. Also it is common to forget to update comments, and with the lack of testing, in the end, you get lies.
But here, the only maintenance these comments need is to remove them if they stop being relevant. For example because you finally did the thing you said you wouldn't do, or just scrapped the part. Very low effort, also rather easy to spot, since if you see a thing being done with an explanation on why it is not done, it means someone forgot.
It is also worthwhile because as programs grow, they tend to do more, and a "not" assumption may no longer hold (there used to be 4 parameters, now there are 10000...), meaning it is something you should check.
A lot of slow code comes from an assumption that N will be small, and N ends up being big. By the way, that's why I favor using the algorithm with the lowest time complexity that is reasonable, even if it is overkill and slower on small sets. Because one day, the set may not be small. If for some reason using a low-complexity algorithm is problematic, for example because the code is too complex or the slowdown on smaller sets too great, then comes the "why not" comment.
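A "why not" comment of that kind might look like the following Python sketch. `find_name` and the claim about list size are assumptions made up for the example; what matters is that the comment states the assumption and when to revisit it.

```python
from typing import Optional


def find_name(records: list[tuple[int, str]], record_id: int) -> Optional[str]:
    # WHY NOT a dict lookup? This list is assumed to hold a handful of entries
    # and to be rebuilt on every call, so building an index would cost more
    # than the O(n) scan below. If the list grows large or becomes long-lived,
    # switch to a dict keyed by record id.
    for rid, name in records:
        if rid == record_id:
            return name
    return None
```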
In his specific example, a comment is not the best solution. You can create self-documenting code easily with an interface and an implementation, also commonly referred to as the strategy pattern.
Interface expresses “what the function does” while the implementation expresses “how it’s done with what tradeoffs”.
Bonus, in the future, if performance ever does become a problem, you swap in the new optimized implementation behind the existing interface.
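A rough TypeScript sketch of that idea, borrowing the multi-pass macro replacement discussed elsewhere in this thread. The names MacroExpander, MultiPassExpander and renderMath are hypothetical, made up for illustration, and the two macros are placeholders:

```typescript
// Hypothetical names throughout; an illustration, not anyone's real code.

// The interface says *what* is done.
interface MacroExpander {
  expand(input: string): string;
}

// The implementation says *how*, and its name and comment carry the tradeoff:
// one full pass over the string per macro. Simple, but O(macros * length).
class MultiPassExpander implements MacroExpander {
  constructor(private readonly macros: ReadonlyArray<[string, string]>) {}

  expand(input: string): string {
    let out = input;
    for (const [macro, symbol] of this.macros) {
      out = out.split(macro).join(symbol);
    }
    return out;
  }
}

// Callers depend only on the interface, so a single-pass implementation
// could be swapped in later without touching them.
function renderMath(expander: MacroExpander, source: string): string {
  return expander.expand(source);
}

const expander = new MultiPassExpander([
  ["\\alpha", "α"],
  ["\\beta", "β"],
]);
const rendered = renderMath(expander, "\\alpha + \\beta");
```

Note that the "why not a single pass" reasoning still ends up in a comment on the class; the interface only moves where that comment lives.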
Interfaces are not free. Depending on the implementation, and how you use them, it can have significant runtime costs, some compile cost (though usually negligible) and most importantly, code complexity.
When you see a call to an interface, you don't know what the code does concretely, you only know what it is supposed to do. The actual implementation and how it ties to the interface may be in a completely different place. It is one of my biggest sources of headaches when debugging code.
Interfaces are useful, the strategy pattern is useful, but overuse is harmful. My idea is to not use an abstraction unless I know I am going to need it. For example, let's say I need to decode video, and I have a hardware decoder and a software decoder. Here, an abstraction makes sense, as I know the software decoder will be prohibitively slow on some platforms, and the hardware decoder won't always be supported. But if the optimized version is sufficiently better so that it makes the old version obsolete, just change the code.
And if I don't know if I will need to change strategies later, I just write the first strategy, and if it calls for an interface later, only then I will do the abstraction.
You don't need a function, a simple {} block is enough if the language supports it. Some languages (ex: Python) don't have that though, you may need a function, or some other construct (ex: if true) in this case.
I resisted putting comments in my languages for years. My reasoning was it was always a flaw of my code (or the language) if I couldn't express myself in typed code.
Then I realized that my languages will never be perfect, and having comments is an essential escape hatch. I was wrong and I changed my mind.
> This is incredibly inefficient and I could instead do all 16 replacements in a single pass. But that would be a more complicated solution. So I did the simple way with a comment:
> Does 16 passes over each string
> BUT there are only 25 math strings in the book so far and most are <5 characters.
> So it's still fast enough.
I've been in this exact situation quite a few times — use a bad algorithm because your n is low. However, instead of commenting, I did something like this instead:
function doStuff(items: Item[]) {
  // Warn once the "n is small" assumption stops holding.
  if (items.length > 50) {
    logger.warn("there's too much stuff, this processing is O(n^2)!");
  }
  // ... do stuff
}
> When I was first playing with this idea, someone told me that my negative comment isn't necessary, just name the function RunFewerTimesSlowerAndSimplerAlgorithmAfterConsideringTradeOffs.
Wow, someone actually suggested that?! Do people write whole programs like this?
There are many, many developers who are, deep in their hearts, "programmers"--someone who creates a plan of action, a program, for very complex machines to follow. They are skilled at discerning The Right Way to solve a problem. Actually, THE Right Way. And since they are Programmers, the Right Way is to find the Best Way to tell the computer what to do. Computers are completely oblivious to comments, so they aren't The Best Way. Comments are ambiguous, so they can't be THE Right Way. Computers parse identifiers, so a good variable name is telling the computer _something_ at least, and so is vastly preferable to comments. If there's a mismatch between comment and code, the code is Reality so the comment must be wrong, and is thus misleading and worse than useless.
In reality, of course, software development is about getting large groups of people in-sync as to problem, solution, and implementation. That takes lots and lots of communication. Ambiguous, messy people-to-people stuff, without The Right Solution to contents or wording. Because there's a bias towards thinking of the Program as Reality, it never occurs to such developers that an identifier can become just as outdated or wrong as a comment, and in fact it is easier to correct a comment than to globally rename identifiers, or that a 20-word comment carries vastly more information and nuance than a 20-character function name, or that trying to shoehorn information important to developers into a language used to tell _computers_ how to function is worse than trying to drop assembly language into a SQL query.
> Computers parse identifiers, so a good variable name is telling the computer _something_ at least, and so is vastly preferable to comments.
I realize you're trying to explain a way of thinking that you don't actually share—but this doesn't make sense to me either.
Computers parse identifiers in only the simplest sense: they care if two identifiers are identical to each other or different. So, any identifier longer than one or two characters is in some sense a comment, because everything longer than the minimum necessary to make the identifier unique is ignored. Identifiers are literally stripped from the output if you don't retain debug symbols, or if you use a minifier (depending on the type of language).
I wasn't being sarcastic, if that's what you mean. (And I'm not a professional programmer, so my coworkers don't write code.)
To me, the suggestion seems so incredibly bad that it leaves me wondering about the headspace of someone who would suggest that. Is it somehow less bad than it seems?
My rule of thumb has been "comment stuff that isn't the naive solution", basically anything that would make someone think "wtf is this" the first time they read it.
My biggest headache right now has been getting high throughput with SQL, and as such I've had to do a lot of non-obvious things with batching and non-blocking IO in Java to get the performance I really need. As a result, a lot of the "obvious" solutions don't work (at least with a reasonable amount of memory). Consequently I've been pretty liberally commenting large segments of my code so that someone doesn't come in and start bitching about how "bad" my code is [1], "fix" it, and then make everything worse by rewriting it in a more naive way that ends up not fulfilling the requirements.
[1] I have since stopped doing this, but I'm certainly guilty of doing this in the past.
I find myself following only two or three different patterns in comments:
There's often a fairly small kernel of very dense code that abstracts away a bunch of complexity. That code tends to have well north of a 1:1 comment to code ratio, discussing invariants, expectations, which corner cases need special handling and which ones are solved through the overall structure, etc.
Then there's a bunch of code that builds on that kernel, that is as close to purely declarative as possible, and aims for that "self-documenting code that requires no comments" ideal.
Finally, there's the business logic-y code that just can't be meaningfully abstracted and is sometimes non-obvious. Comments here are much more erratic and often point at JIRA tickets, or other such things.
Apart from what/why/why not: Something I started doing recently is to put URLs in the comments:
- URL of documentation for a complex feature
- URL with a dashboard with telemetry
- URL of a monitor which checks if the given feature, CI job, GitHub action etc. works correctly.
In a big project, figuring this stuff out is not trivial, requires a lot of searching with proper search terms and/or asking the proper knowledgeable person.
I find it weird that code is often so detached from everything non-code.
One issue with comments is that an engineer who struggles to write clearly will also struggle to write clear comments. And clear code. But at least the code can be deciphered by reverse engineering it, stepping through it, or in some cases rewriting it. For me, at least the comments are worth reading before trying to figure out what someone's code does, to get me into the ballpark.
I often add comments to code as I decipher it, then remove them again when I figure things out.
An engineer who struggles to write clearly is one I would hope not to hire. Programming is first and foremost a problem discovery job, then a writing job, then math and logic.
No matter how efficient or eloquent, if you solve the wrong problem, you’re doing no one any good.
Solve the right problem efficiently, but in a way no others can understand, and you’ll flounder in a team. You will also never be able to collaborate or make larger work move faster.
I definitely agree, but still, a lot of people are told "you will never use your math or writing after college." It may also be that they perceive themselves to be judged by how much code they can sling, and not the quality of their writing. And in a large enough department, the work that depends on math ends up being given to the one or two "math people." I'm one of those.
But I suspect that there's a relationship: If their writing is hard to read, their code will be hard to read.
Although I had started programming when I was nine, in high school I was fortunate enough to have a computer science teacher with a PhD in the field. One of the habits drilled into me was extensive commenting.
Every function (or procedure) starts with a comment block. It first talks about the what and why. Then, a line for the inputs and another for the outputs. Next -- and this is done closer to the end of the writing -- I describe what it calls and what it is called by. The comment block optionally finishes with room for improvement.
The function itself probably has other comments. Usually for anything which is not blindingly obvious. Because I write code like a caveman, wherein only one thing happens on one line, most everything is quite clear. If there's anything weird or magical that has to happen, it gets a comment.
Elegance and cleverness are reserved for data structures, algorithms, and so on, rather than doing a lot of stuff in as few lines as possible. I do this for Future Me, who might be having a bad day, or for anyone who wants to adapt my code to something else.
One of the last steps in a finished program is going through and making sure that my comments match my code. I am a very boring kind of programmer.
Another big one I wish I saw more often is "these are the circumstances under which this assumption was made and here are the steps you can take to check if those circumstances have meaningfully changed."
In other words, the comment allows the author to reach into the future and co-debug with the reader, even if the author is no longer there.
I have learned to enjoy narrating my struggles in the comments, probably to excess; but it certainly makes it much easier to pick up a task again after I leave it alone for a week or two.
People rarely touch what I write, but if they do, and they want to strip the comments out, that's totally fine with me, just don't ask me how it works after you do :P
I have had the experience this year of working on a system with 150k lines of C and 150k of Typescript. The gentleman who wrote it in 36 months quit and here I am.
He did not believe in comments, much. I think he thought he was commenting the hard stuff, meh, I am unsure.
It has led me into strong opinions.
* Document every function. The function should make clear what the preconditions, postconditions, and the purpose are
* Document every file/code unit. Why it exists
* Document important loops like functions
* Document the easy stuff. It is not easy if unfamiliar
* Review the comments when working on code
This would have saved my company about thirty percent of my time.
The compiler does not verify comments, like it does code, so it is a burden. Bad programmers get another opportunity to sow chaos, I know. But one of the main purposes of code is communication with following humans, as well as controlling machines.
Careful and thoughtful comments are a professional obligation IMO.
When I see a sequence of string replacements, instead of performance the main thing I worry about is whether the output of one replacement matches a pattern for another replacement. I see variations of this often during code review.
Doesn't seem to be a problem here though because they're replacing macros by symbols that are known ahead of time.
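To make that concern concrete, here is a small TypeScript sketch. With sequential replacements, the output of an earlier rule can be rewritten by a later one, while a single pass over one alternation regex only ever looks at the original text. The rule table and both function names are hypothetical, and the rules are deliberately chosen so that they collide:

```typescript
// Hypothetical rules, chosen so the first rule's output matches the second rule.
const rules: ReadonlyArray<[string, string]> = [
  ["a", "b"],
  ["b", "c"],
];

// Sequential passes: each rule sees the output of the previous one.
function replaceSequentially(input: string): string {
  let out = input;
  for (const [from, to] of rules) {
    out = out.split(from).join(to);
  }
  return out;
}

// Single pass: one regex alternation, so replaced text is never rescanned.
function replaceInOnePass(input: string): string {
  const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const lookup: Record<string, string> = {};
  for (const [from, to] of rules) {
    lookup[from] = to;
  }
  const pattern = new RegExp(rules.map(([from]) => escapeRegExp(from)).join("|"), "g");
  return input.replace(pattern, (match) => lookup[match]);
}

const sequential = replaceSequentially("ab"); // "ab" -> "bb" -> "cc"
const onePass = replaceInOnePass("ab");       // "a" -> "b", "b" -> "c": "bc"
```

If one pattern is a prefix of another, the order inside the alternation matters too, so longer patterns would need to come first.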
Another approach is ADRs, which document alternatives considered, but these are documentation, not comments. I've found them useful for building consensus around architecture decisions.
The title's a little odd, unless it was to grab attention. It's saying "Why [I use] 'not comments'", not asking "why not comments?" A "not comment" here is an explanation of why the programmer didn't choose the obvious approach. I agree: that's a very valuable thing to document for the next person.
For instance, you might write something like:
# I used a bubble sort instead of a quick sort here because
# the constraint above this guarantees there will never be
# more than 5 items, so it's faster to use the naive
# algorithm than to implement a more complex algorithm that
# involves more branching.
or
# Normally we'd do X, but that broke customer Y's use case
# based on their interpretation of our API docs which we
# had kind of messed up. So now we do Y because it works
# under both interpretations, at least until we can get
# them to upgrade.
Basically, tell your audience why you're not using the expected method. It's not because you didn't know about it, but because you do know and you've determined that it's not a good fit for this use case.
The problem with the title is simply that the English language allows open compound words. I know many Germans wonder why we allow this. Germans push the words together, for clarity. I've suggested that we use hyphens. Hyphens feel natural in English, and could remove the ambiguity that exists whenever we use open compound words (that is, open-compound-words). In this case "not-comments" would have added clarity.
Likewise, the title "World's longest DJ set" was confusing, because most people will assume that the compound word is "DJ-set". But if you read the whole article, then you realize that a python snake fell on the mixing board and accidentally mixed some tunes. So the compound word was actually "longest-DJ" -- a 2.5 meter python.
We should all consider using hyphens for all compound words.
>I've suggested that we use hyphens. Hyphens feel natural in English, and could remove the ambiguity that exists whenever we use open compound words
We do use hyphens in English. Well, some of the time, and some of us. I could be wrong, but given my age and my readings of older texts, I do feel that the use of hyphens in this way has become less common, and that it was much more common decades ago to avoid ambiguity.
"longest DJ" isn't a compound, it's a noun modified by an adjective, and as such it would be written as two different words in German as well ("längster DJ").
Because the article is a rebuttal to those who oppose writing comments (that is where "why not comments" means "why you shouldn't write comments") where the main argument is that you should write "why not" comments (that is "why not comments" means write comments explaining why you didn't do something or do something a certain way).
You can still document the why in commit messages![1]
I feel like I’m getting off the self-documenting code ride. In our own codebase we rely way too much on “descriptive names”. Like full-on sentence-names. And is the code self-documenting? Often not. You indeed cannot describe three or more axes of concerns in one name.
Do comments go stale? Well, why do they? Too loose code reviews? Pull requests that have fifty lines of diff noise that you glaze over? We have the tools to do better on that front than some years ago at least.
It’s a joy to find a corner of the code base where things are documented with regular sentences. Compared to having to puzzle through five function call layers.
[1] But yeah, really. But also: sometimes also in comments. Sometimes both.
I mostly like a comment with a succinct explanation and a longer commit message. Comments are less likely to be refactored and lose the immediate git blame, while the function might change over time.
Ideally the future user would trace the git blame back to the original commit if they really had questions.
A long comment is really helpful sometimes though. I like to put ascii truth tables for complex boolean logic, both to ensure I cover all cases when writing the code (tests, too) and to make it easier for future me to understand what's going on at a glance.
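For illustration, such a truth table might look like the following TypeScript sketch. The function canPublish and its three flags are invented for the example:

```typescript
// Hypothetical rule: an admin may publish anything; an author may publish
// their own draft unless it is locked.
//
//  isAdmin | isAuthor | isLocked || canPublish
//  --------+----------+----------++-----------
//     F    |    F     |    F     ||     F
//     F    |    F     |    T     ||     F
//     F    |    T     |    F     ||     T
//     F    |    T     |    T     ||     F
//     T    |    F     |    F     ||     T
//     T    |    F     |    T     ||     T
//     T    |    T     |    F     ||     T
//     T    |    T     |    T     ||     T
function canPublish(isAdmin: boolean, isAuthor: boolean, isLocked: boolean): boolean {
  return isAdmin || (isAuthor && !isLocked);
}

// The same table as data, so a test can walk every row.
const truthTable: ReadonlyArray<[boolean, boolean, boolean, boolean]> = [
  [false, false, false, false],
  [false, false, true, false],
  [false, true, false, true],
  [false, true, true, false],
  [true, false, false, true],
  [true, false, true, true],
  [true, true, false, true],
  [true, true, true, true],
];

const allRowsMatch = truthTable.every(
  ([admin, author, locked, expected]) => canPublish(admin, author, locked) === expected,
);
```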
Right, commit messages are at least always right for the context that comes with them. I think long descriptive names are an anti pattern. As per word puzzles you can't actually read the middle of a blob of text so you are basically left with a dozen variables that are cognitively the same as isDatabaseHidingDragonsSetter().
Mine do this as well, but in PHP. They adopted docblocks over a decade ago and never defined any best practices or used any tools to check them, so comments don't even match the code half of the time. We're on PHP 8 now which can enforce types in the code itself but still have to include docblock types because that's how it has always been done, so the problem persists. It's maddening
Personally I think you're better off putting 'not comments' in git commit information. Git commit comments are like normal comments except they don't get detached from the code.
I believe this can work in slow paced opensource projects which do squash-merges (one PR = one commit), every commit is green and beautiful and well described with properly formatted message.
I worked on such project in my first job and I cared a lot about my commit messages, and changelogs.
On the other hand, in fast paced corporate development I barely ever see someone make commit messages like this. Even PR titles often leave a lot to be desired.
Squash merging is a practice I mostly see recommended (I mean: mandated) from those corporate environments which churn out code like a paper mill trying to put the Amazon out of existence. OSS projects that can take their time with nice commit messages—which even review them, not just the code—don’t have to limit themselves like that.
Do you really need to comment to yourself or anyone else that you did a thing the less efficient way because it was simpler? I guess I'm not seeing the use. When later you've perturbed the criteria that caused that function's inefficiency to be immaterial to overall performance needs and it suddenly becomes material, profiling is going to reveal the low-hanging fruit faster than digging through the code for a “this could be faster a different way” comment.
I once wrote a bubble sort into production code, because the total number of elements to sort rarely exceeded 4 and it was what I could do off the top of my head. I don't remember if I left a comment explaining the reasoning, but I think I did. A year later a new feature invalidated my assumption about the number of elements and the sort was way too slow. I'm sure the person who inherited that code cursed me a few times.
I would call this “defensive coding”, analogous to defensive driving. And comments that talk about something that isn’t there is way too defensive, kind of like the excessive braking and overly polite waving that can make 4 way stops so confusing. Write your code clearly, keep your functions wholly visible without scrolling, keep the comments to a minimum, and put the ideas that can’t be expressed as code in a README.
They may also sometimes be useful, when designing algorithms, for documenting pre- and post-conditions and invariants in procedural languages that lack a way to embed such specifications. You can't really reason about these things in the language itself and end up having to use something else (predicate calculus usually) but it's nice to at least have an indicator, as tfa suggests! What code doesn't do is important!
> In recent years I see more people arguing that whys do not belong in comments either, that they can be embedded into LongFunctionNames or the names of test cases.
Who is arguing this? Usually, if I'm adding a comment on why something is done a particular way, it's something that is going to take at least a full sentence to explain if not a whole paragraph.
I am working on the abandoned idea of a rapgenius for code. I think it still is a useful idea for the open source world. You can join the HN learn discord [1] if you want, so I can keep you posted.
Comments and docs are lies that I dearly love. I want and need them, but I never forget that they're, at best, helpful lies. The code does exactly what's written, the comments adhere more or less, sometimes.
Maybe what we need is a single vocabulary word that means “I’m doing something that won’t scale well to large inputs but is still worth writing for now” then you could name the function replaceEscapeCharsNewWord()
I've named these things "naive" sometimes. Like "naiveQueryBuilder" or whatever the appropriate term would be. They're also useful for creating tests because the naive version is usually "obviously" correct (still write some tests, but you don't need much more than sanity checks) and can be the oracle for the faster version (you might want to cache results rather than run the naive one each time, though).
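A minimal sketch of the oracle idea in TypeScript. Everything here is hypothetical and chosen for brevity: a duplicate check stands in for whatever the real function would be, and the pseudo-random generator is only there to keep the run reproducible.

```typescript
// Naive version: compare every pair. O(n^2), but easy to see that it is right.
function naiveHasDuplicates(values: readonly number[]): boolean {
  for (let i = 0; i < values.length; i++) {
    for (let j = i + 1; j < values.length; j++) {
      if (values[i] === values[j]) return true;
    }
  }
  return false;
}

// Faster version: sort a copy, then look only at neighbours. O(n log n).
function hasDuplicates(values: readonly number[]): boolean {
  const sorted = [...values].sort((a, b) => a - b);
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i - 1] === sorted[i]) return true;
  }
  return false;
}

// Use the naive version as the oracle on many small pseudo-random inputs.
function agreesWithOracle(trials: number): boolean {
  let seed = 12345; // fixed seed so a failure is reproducible
  const next = (): number => {
    seed = (seed * 48271) % 2147483647;
    return seed;
  };
  for (let t = 0; t < trials; t++) {
    const input: number[] = [];
    const length = next() % 8;
    for (let i = 0; i < length; i++) {
      input.push(next() % 10);
    }
    if (hasDuplicates(input) !== naiveHasDuplicates(input)) return false;
  }
  return true;
}

const fastMatchesNaive = agreesWithOracle(200);
```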
There is no hope for the software industry to mature if we cannot agree on some basic coding practices. Like the judicious use of comments to improve maintainability.
In the age of AI and Cursor, I make my function name as expressive as I can, and I make sure to add a couple of lines of comments, either generated or manual.
It makes it way easier to send these into Claude (it seems, at least). I hope they introduce a semantic/vibes search too as I can never remember what I name my classes and functions...
This seems a bit like a strawman argument. I don't think anybody says that we should never use comments.
The problem with comments is that they can become stale, and it's often possible to self-document or write simpler code that causes less surprise. But of course, it's totally fine to put comments.
And I think comments should be mandatory for interface functions/types unless their behavior is obvious. I don't want to read the code to understand what a function does, or what invariant a class maintains. And if it's too complex to document in a few lines, probably this isn't the right interface. But apparently, this isn't obvious for everybody. In my company, most of the code isn't documented.
Comments tend to get outdated quickly as your app grows. If you are not careful with your phrasing you might even introduce misinformation into your app. I'd rather read the code instead of comments.
I love when a docstring saves me from reading code. "Takes parameter x which should be in state s, has side effect e, returns j, throws exception if .." Seeing that in my IDE popup and moving on with all the info I need is so satisfying compared to having to waste my time on an unwanted spelunk. It's also impossible to embed all this in a name that isn't a camelcase paragraph.
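As a sketch of what such a docstring can look like in TypeScript (the Account type and withdraw function are hypothetical, made up to show the shape):

```typescript
// Hypothetical type and function, invented to illustrate the docstring shape.
interface Account {
  balance: number;
  frozen: boolean;
}

/**
 * Withdraws `amount` from `account`.
 *
 * Side effect: mutates `account.balance`.
 *
 * @param account - Must not be frozen.
 * @param amount - Positive amount, in the account's currency.
 * @returns The balance after the withdrawal.
 * @throws Error if the account is frozen, the amount is not positive,
 *         or the balance is insufficient.
 */
function withdraw(account: Account, amount: number): number {
  if (account.frozen) throw new Error("account is frozen");
  if (amount <= 0) throw new Error("amount must be positive");
  if (amount > account.balance) throw new Error("insufficient balance");
  account.balance -= amount;
  return account.balance;
}
```

None of the state requirement, the side effect, or the failure cases would fit in the function name.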
That works if the code wasn't written by, well, lunatics who decided that clear simple code was The Work Of Satan and if you couldn't have a tower of interfaces or write your own bafflingly complex ORM, what was even the point? Then the people who follow will almost certainly need some "what" commentary (as well as the "why" but IME the lunatics who spurn clear and simple code never think they have to explain "why" either.)
Oftentimes, the reason for that is that 20 years ago stuff like Doctrine, Liquibase or whatever just didn't exist. You know, the time when PHP developers shipped straight mysql_query calls with direct interpolation of $_GET, and most "enterprise" Java application came with a ton of SQL scripts and a dedicated multi page UPGRADE file explaining in which order you had to run the schema migrations, reboot systems, run manual migration scripts and whatnot to get an upgrade done. Some times, upgrades could literally take days.
Naturally, people invented their own stuff to make stuff just suck a little bit less, and it got more and more used in a company, only ever extended in functionality... the dreaded "corpname-utils" JAR dependency (if you're really unlucky, the JAR having been semi-restored from a half-broken decompile because the sources got lost along the way) or util.php that just got copied over from project to project. And that's how you end up in 2024, still maintaining some ORM that has its origins in Perl code written in the 90s by someone deceased in the '00s. (Yes, I've been there, although not that bad)
In my first job, my older colleagues, most of them now managers, had managed to write their own library. It included its own timezone management, a wrapper on top of DEC's OSF/1 AXP concurrency primitives, a realtime memory-mapped database format, a compiler that ran not on files but on expressions stored in an Oracle database, and even their own CORBA-like object sharing over TCP/IP.
These people were wizards, and probably did a lot of stuff just to show their coding prowess, but a decade later when I joined that company most of the younger programmers did not dare to touch that code. I had to do that when the software was being deployed in Brazil, where nobody had anticipated how DST changes work in the southern hemisphere.
> Oftentimes, the reason for that is that 20 years ago stuff like Doctrine, Liquibase or whatever just didn't exist.
Sadly I am talking about something written in Perl (blessed with ORMs since at least 2001) around the early-mid 2010s. Not a single reason it should have been written. The only thing it gave us over well established Perl ORMs were code that no-one understood, no support for that code, and a panoply of infuriating bugs that constantly broke production.
Most comments belong in the git log message.
That's where you want to discuss the "why not".
You have all the space you need in order to do that, without cluttering the code.
The log message will accurately pertain to the change made at that time.
When commits are rebased, the log message must be revisited and revised. Changes can disappear on rebasing; e.g. when a change goes into a baseline in which someone else made some of the exact same changes in an earlier commit, so that the delta to the new parent is a smaller patch. In my experience, commit messages stay relevant under most rebasing.
Comments are (largely) an obsolete version of version control log messages.
In the 1980s, there was a transitional practice: write log messages, but interpolate them into the checked out code with the RCS $Log$ thing. This was horrible; it practically begs for merge conflicts. It was understandable why; version control systems were not ubiquitous, let alone decentralized. You were not getting anyone's RCS ",v" file or whatever.
Today, we are a few decades past all that. No $Log$ and few comments.
Mainly, the comments that make sense today are ones which drive automatic API documentation. It would not be reasonable to reconstruct that out of the git history. These API comments must be carefully structured so the documentation system can parse them, and must be rigorously maintained up-to-date when the API changes.
How am I supposed to be aware of a commit message specifying the “why not” when reading the code later down the line?
I could easily imagine somebody refactoring code to an obviously better version, finding out it doesn’t work for subtle reasons, running git blame and cursing the person who left that information in a commit message instead of a comment.
That's a different kind of why not: about why we didn't do it in this seemingly obvious incorrect way, but had to do it in a more complicated correct way.
If that obvious incorrect way happens to match what you're thinking of doing, then that kind of why-not documentation will help.
Warning future developers not to make certain changes is indeed something that should go in a comment. We can make a short comment like, /* if you think you can rewrite this in a simpler way, read commit df037acb first */
There are unfortunate situations when things have to be changed in multiple places together; you must not forget all of them. That deserves comments.
If you have an enumeration that has to pack into 3 bits, there should be a comment warning not to add more elements than eight without increasing the number of bits; especially if there is no compile time assertion for it.
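A rough TypeScript rendering of that situation, with the warning comment plus a load-time check standing in for a compile-time assertion. PacketKind, KIND_BITS and the header layout are all hypothetical:

```typescript
// Hypothetical example: a packet kind that must fit in the low 3 bits of a header.
const KIND_BITS = 3;

// WARNING: values must fit in KIND_BITS bits (0..7). Adding a ninth kind
// means widening KIND_BITS and revisiting every place that packs or unpacks it.
enum PacketKind {
  Ping = 0,
  Pong,
  Data,
  Ack,
  Nack,
  Open,
  Close,
  Reset, // 7: the last value that fits
}

// Load-time check: fail fast if someone adds a value without widening the field.
// Numeric enums also map values back to names, so keep only the numeric keys.
const kindValues = Object.keys(PacketKind)
  .filter((key) => !isNaN(Number(key)))
  .map(Number);
const maxKind = Math.max(...kindValues);
if (maxKind >= 1 << KIND_BITS) {
  throw new Error(`PacketKind needs more than ${KIND_BITS} bits`);
}

// Pack 5 bits of flags above the 3-bit kind.
function packHeader(kind: PacketKind, flags: number): number {
  return ((flags & 0x1f) << KIND_BITS) | kind;
}

function unpackKind(header: number): PacketKind {
  return header & ((1 << KIND_BITS) - 1);
}

const header = packHeader(PacketKind.Data, 5); // (5 << 3) | 2 = 42
```

The check catches the mistake when the module loads, but the comment is what tells the next person why the limit exists in the first place.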
I think I get it better now but I’m still not finding why keeping the whole comment in a commit message is better than in the source code. If it’s in source, I don’t need a context switch to read it, I can improve it (e.g. if the wording was ambiguous), I can easily extend it without asking people to read commits X, Y and Z in order.