I saw someone quip (on twitter, I think) many years ago something like:
"A junior engineer writes comments that explain what the code does. A mid-level engineer writes comments that explain why the code does what it does. A senior engineer writes comments that explain why the code isn't written in another way."
(except punchier, of course. I'm not doing the quip justice here)
> writes comments that explain why the code isn't written in another way."
Exactly! I have written code that required comments five times as long as the code itself to defend the code against well-meaning refactoring by people who did not understand the code, the domain, or the care needed to handle large dynamic ranges.
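To make that concrete, here is a minimal sketch of the kind of "defensive" comment I mean. The function and its wording are my own illustration (a standard log-sum-exp), not the code from that project:

```python
import math

def log_sum_exp(values):
    """Return log(sum(exp(v) for v in values)) for a list of floats."""
    if not values:
        return -math.inf
    # WHY NOT the obvious one-liner, math.log(sum(math.exp(v) for v in values))?
    # math.exp(v) raises OverflowError once v is above roughly 709, and it
    # underflows to 0.0 for very negative v, after which math.log(0.0) raises
    # ValueError. Subtracting the largest value first keeps every exponent
    # at or below 0, so the sum always lies between 1 and len(values).
    peak = max(values)
    if math.isinf(peak):
        # Avoid inf - inf = nan in the subtraction below.
        return peak
    return peak + math.log(sum(math.exp(v - peak) for v in values))
```

The comment is longer than the code, and that is the point: it tells the next person exactly which "simplification" was already considered and why it fails.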
I have also written substantial functions with no comments at all because it was possible and practical to name the function and all the variables so that the meaning was clear.
It's true! If you come across a well-documented and complex piece of code, it's always easier to delete it all and write a simpler (and less correct) replacement than reading all of that code and documentation. And since your replacement is simpler, you're free to delete the inapplicable documentation that you saved so much time not reading!
If you didn't back that documentation up with some tests to reflect the necessity of that complexity, this is an avoidable tragedy. But if the junior then comes in and deletes your tests because they don't pass, that's a firin'.
In the end we had sufficiently good tests to detect such foolishness. But even that isn't good enough because if the result of the test is merely slightly off some people will just adjust the assertions!
I do all of the above. Summary comments are incredibly helpful and have been validated by empirical research. Too bad many have drunk deeply from the Clean Code Kool-Aid and can’t be saved.
I got some flak at a prior job for saying I had some quibbles with Clean Code (a few years after I had read it), and I'm glad this opinion is more popular today. There's so much cargo culting hype with "best practices" and style, I hate it. Same with how overly dogmatic people were with OOP paradigms when it came out (remember using anonymous interfaces to pass a function around in Java?). Same with the functional backlash to that.
It's fun and enlightening to go ham on any particular style/framework/philosophy, but actually living by dogma in prod gets kinda dangerous and imo is counter to the role of a senior+ engineer
The very term "best practice" is such a loaded term. It implies:
1) empirical measurement compared to a large number of alternatives ... the empirical measurement or study is never mentioned: because they do not exist
2) the best practice is valid in all measurements and criteria of comparison: performance, elegance, simplicity, correctness
3) since there is no data, the reasons for why the practice was designated best are rarely even explained
4) nor are the circumstances or individual or source of "best practice" detailed
5) it will always be the best practice: it is the BEST! It CANNOT be improved.
So we have an unsubstantiated, unargued, unsourced, non-authoritative, exaggerated declaration in virtually every case of "best practice"
This is an example of taking something that is contextually good advice and applying it to all situations, which turns it into bad advice.
If you can make your code more clear, so that comments aren't necessary to explain what it does or how it works, that is probably (but not always!) something you should do. But that doesn't mean you shouldn't have comments. At the very least there should be comments explaining why (or why not) things were done a certain way.
This is fine. A problem arises when you assume that this will always be the case, for all developers, and then mandate that they omit comments, _without checking if your assumption is true for their case_.
IMHO that rule can be generalized: Whenever you make rules for other devs, make sure that the assumptions on which those rules are based are true, lest you interfere with their work in a negative way.
A line of code can tell you what it does but not why. Unless you are on a newish codebase, you will likely need comments to explain why certain decisions were made
well-named works great while you're writing the code. come back to it in a few years, or hand it over to somebody new, and you would realise that what looks like a good name to you means nothing to somebody else.
That's a horrible way to name a function. Function names should be short, punchy, and unambiguous. They should create a simplified abstract narrative, and all the details should be put into the docstring, so that they can be easily accessible without having to (a) be squashed into an identifier, or (b) be repeated every time you want to call the function.
There are indeed those situations where a comment would not increase the clarity of the code.
But one should be careful not to think of this as a zero-sum dichotomy, where you either have well-named functions XOR you have comments, because in reality choosing both is often the golden path to success.
The danger is of course that code that is totally obvious to you now will take far more time to become as obvious later, be it to your future self or to your psychopathic lunatic co-worker who knows where you live.
So very often code can be made more readable by adding comments, even if it is just saying the same thing with other words, just by reducing ambiguity. Comments can also bridge higher level concepts in a good way, e.g. by you explaining in a few lines how a component fits into a concept that is spread out over multiple files etc.
In the end code is text and like regular prosaic text you can both make it harder to understand by not mentioning things or by mentioning too many or the wrong things. This is why it is not irrelevant for programmers to be good and empathic communicators. Sure in the end readability doesn't matter to the computer, but it certainly matters to all people involved.
That is like saying: "A perfectly good road needs no road markings". The point of good comments is that they make the code faster to read and less ambiguous. While good code should indeed already be readable and unambiguous, I have rarely seen code that couldn't be made even easier to understand and faster to parse by writing the appropriate comment.
But of course you will have some individuals who think it is cooler not to, and they are probably the same people who think use after free bugs can be avoided by the sheer willpower of the solo-male-genius that they are.
> I was specifically taught that good, readable code could explain itself; that it would make comments redundant.
Good readable code removes the need for comments about what the code does, if the working of the code needs extra explanation then perhaps it is being too clever or overly terse, but there are other classes of comment that the code simply explaining itself can't cover.
Some of my comments cover why the code does what it does, perhaps linking it to a bigger picture. That could be as simple as a link to work ticket(s) which contain (or link to) all the pertinent details, though I prefer to include a few words of explanation too in case the code is separated from whatever system those work items are logged in.
Many comments state why things were not done another way. This can be very similar to “why the code does what it does” but can be more helpful for someone (perhaps your future self) who comes along later thinking about refactoring. These can be negative notes (“considered doing X instead, but that wouldn't work because Y or interaction with Z” – if Y and Z become irrelevant that future coder can consider the alternative, if not you've saved them some time and/or aided their understanding of the bigger picture), helpful notes for future improvement (“X would be more efficient, but more complex and we don't have time to properly test the refactor ATM” or “X would be more efficient but for current use patterns the difference would be too small to warrant spending the time” – the “but” parts are not always stated as they are usually pretty obvious). A comment can also highlight what was intended as a temporary solution to mitigate external problems (“extra work here to account for X not being trapped by Y, consider removing this once that external problem is fixed” or “because X should accept Y from us but currently doesn't”).
Code explains the what but not the why. And even then, the what might not be so clearly obvious.
This is one of those blindspots devs have in that they believe their code to be good and obvious to everyone but in reality it is not even good and obvious to their future selves who will be the ones maintaining that code
If the code is self-explanatory, then the comments are redundant, and it is OK to delete them. But from time to time, there are things that are not so obvious in the code. Then it is good to leave a comment.
That could be used to see how good a language is for specific tasks. If you need to write lots of comments, maybe you have the wrong language.
Perhaps the professor was fed up with over-commenting (comments making up a large part of the submitted code), especially if the comments were like the ones in https://news.ycombinator.com/item?id=41506466. Unless the course is "practical software engineering" or similar, where good programming practices are a focus, the why/why-not parts can be requested in an accompanying paper if they would contribute to a better assessment.
When writing it inline, I like this approach. I like even better when these things have names. Something like `std.time.s_per_day` or `time_utils.s_per_day`. Then in the one place they're defined, use a pattern like the above to make them easy to reason about.
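A rough sketch of what I mean, assuming the inline pattern in question is writing the value out as a product of its factors. The module and constant names here are made up for illustration:

```python
# Hypothetical module, e.g. time_utils.py. All names are illustrative.
S_PER_MIN = 60
MIN_PER_HOUR = 60
HOURS_PER_DAY = 24

# Spelled out as a product so a reader can verify it at a glance,
# instead of having to trust a bare 86400.
S_PER_DAY = HOURS_PER_DAY * MIN_PER_HOUR * S_PER_MIN


def days_to_seconds(days: float) -> float:
    """Convert a duration in days to seconds."""
    return days * S_PER_DAY
```

Call sites then read `days * S_PER_DAY`, and the one place that needs explaining is the definition.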
> I did a survey once in about 3/4 of the comments were either wrong or useless.
> Examples:
> //Add 1 to x
> x+=1;
If 3/4ths of comments are like this, maybe show a sampling of public source code (e.g. from github) that shows how prevalent comments like this are in any real codebase.
I've been programming since 1982 and have never seen this type of "add 1 to x" comment in real code, outside chapter 1 of some intro to programming book.
I would write that comment if it was a long enough list of single line var assignments with enough complex ones that each have a comment like this one.
I will also name each db table column, in the correct order and by its correct name, right above whatever decided the value.
For me the quality of comments, somewhat based on the metrics that @renhanxue mentions, is a code smell. If code is poorly commented (by my standards) then I treat the actual code with suspicion.
Yes I write comments like a maniac. Long doc strings that are informally written. It’s more important for me to say what I need someone else (or future me) to know about a function and its context than it is for me to have some beautiful, sterile 300-line autogenerated soulless docstrings.
I write detailed git commit messages like a maniac.
Git commit messages are better than comments, because they tag the specific baseline of code where a decision was made, and you can write multiple paragraphs to explain something (including all the "why not"), without cluttering the program text.
The problem with comments is that they also pertain to a revision that existed around the time they were written, but they stick around, pointing to newer revisions, perhaps falsely. They add clutter. Unless you use a folding editor, comments can separate pieces of code so that you see a smaller window of the program.
One line of code can be touched by many, many commits. Each of those commits should have something to say about that, and all that talk cannot possibly be put into a giant, ever-growing comment next to that line of code. In regard to my previous point, a lot of that talk won't even be relevant to the current version of that line!
I've taken the view that the thing I'm developing is a git repo, not the source tree.
A source tarball is just something for building and deploying, not for development.
If someone wants to understand why something was done, they must use the repo, and not a source tarball. If they insist on just working with the source snapshot, but ask questions that are answerable in the git history, I cannot support them.
I think this is fine if you have buy-in within your team. Your concerns around temporality are valid, though using git over comments is a trade-off: it does make understanding a particular bit of code an exercise in reading backwards until you’ve captured the entire context. It also implies some level of fairly consistent discipline from participating members in terms of structuring commits.
That said, git commit messages as a meaningful source of information aren’t for me, I prefer reading the code as a single pane of information. We alleviate concerns about doc strings and comments becoming out of sync with the code by just reviewing comment salience as part of standard code review.
Either way, I think there’s no wrong answers, if it works for you and your team then that’s good.
It's a lot easier, in a code review, to ask for a detailed commit message that conforms to a certain format, than to say things like, "can you add comments?" Which line or change deserves a comment is subjective.
In my open source projects, I use GNU ChangeLog format in the log messages.
This mentions every file that is touched, and every defined entity (function, type, macro, variable, ...). In a code review, while it would still be difficult to enforce the quality of what is written about these things in the comment due to subjective reasons, there are objective aspects to it. When reviewing, you can point out something like "you changed the initializer of global variable g_foo, but it's not mentioned in the commit message".
Sure, API comment headers can be enforced in reviews. But those are not the comments this discussion is about (and elsewhere I identified these as necessary comments that can't be farmed out into the git log messages).
Yes, we should all be kinder to both others and to future me (you). Do we know who will be maintaining this code?
Obi-Wan meme: Of course I know him. He's me!
I've never worked in a company where the commit log message wasn't just a link/reference to something in a bug tracker.
I feel like a 'what this block of code does' comment is different from 'what is this change, and why did I make it' commit message.
I don't write what the code does or even what "I understand the code to do". I explain choices, especially ones that the next developer or my future self is likely to misunderstand when looking at the code.
I would say that I have observed something like this about as many times as someone has changed some semantics in a way that a variable or function name is no longer correct. That is to say: probably a few times in 25 years, but nothing compared to how much value I have received from them.
It could be that too, but I think that presumes an order - that the comment was written after the code. If the comment was written before the code, then it would describe what the author was trying (intended) to do. Which also implies an order of course.
Yeah, absolutely check who wrote a comment before relying on it
It's luckily rare for people to add wrong comments, though (and those who do should be publicly fustigated).
By the way, please never state something as if it were the truth if you're not sure that it is. Saying "I think" is perfectly fine, and might save people days of investigation.
This is what I tell people: if you're writing a one line code comment, write a debug log message instead.
There are vanishingly few cases where these extra logging statements will ever be a problem, and they can all be handled autonomously if they ever are, but it will save everyone else a ton of time in deployment.
That is, supposing that it's set to saturate at 100 events and they all happen fast enough that it doesn't reset the count. In this case it'll log another batch of 100 on the 201st event.
I think it's important to log the first one immediately, just in case you want to alert on that message. It wouldn't do to wait for the other 99 before raising the alarm.
I wrap my loggers in deduplicators only when I'm about to hand them off to a loop. Otherwise it's the normal ones.
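For anyone wondering what such a deduplicator might look like, here is a minimal sketch. `DedupLogger` is a hypothetical wrapper of my own, not a standard library class, and it leaves out the time-based reset of the count mentioned above:

```python
import logging


class DedupLogger:
    """Wrap a logger so one repeated message cannot flood the output."""

    def __init__(self, logger, batch_size=100):
        self._logger = logger
        self._batch_size = batch_size
        self._suppressed = {}  # message -> repeats swallowed since last emit

    def warning(self, message):
        if message not in self._suppressed:
            # The first occurrence goes out immediately so alerts can fire.
            self._suppressed[message] = 0
            self._logger.warning(message)
            return
        self._suppressed[message] += 1
        if self._suppressed[message] == self._batch_size:
            # Saturated: report the whole batch once, then count again.
            self._logger.warning(
                "%s (repeated %d more times)", message, self._batch_size
            )
            self._suppressed[message] = 0
```

With `batch_size=100`, a loop that hits the same warning 201 times produces three lines: the first event right away, one summary on the 101st event, and another on the 201st.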
Do you routinely run production services at debug or trace log level?
The point is there's a big difference between "we've got a problem, we need to add logging and redeploy to try and isolate it" versus "we might have a problem, bump the logging level up on that service to see what's going on" (which with the right system you can do without even restarting).
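A small sketch of that difference using Python's standard `logging` module. The logger name `orders` and both functions are invented for the example, and how `set_verbose` gets triggered (signal handler, admin endpoint, config reload) is left out:

```python
import logging

log = logging.getLogger("orders")  # hypothetical service logger name


def apply_discount(price, percent):
    # A debug message in place of a one-line comment: it documents the step
    # and can be switched on in a running process when something looks off.
    log.debug("applying %s%% discount to %s", percent, price)
    return price * (100 - percent) / 100


def set_verbose(enabled: bool) -> None:
    """Raise or lower the log level at runtime, without a redeploy."""
    log.setLevel(logging.DEBUG if enabled else logging.INFO)
```

At INFO level the `log.debug` call is filtered out by the level check before any formatting happens; after `set_verbose(True)` the same line starts showing up.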
This 'empirical research' is highly doubtful. The first question to ask is what the code with summary comments looked like. While summary comments can sometimes be helpful, this is mostly the case in functions that are relatively long. A question that always arises in that case is whether it is a good idea to split them instead of commenting.
We could have the best of both worlds if comments could be easily hidden, or better yet, just additional meta-data on rich text code. But nope, we can't get away from ascii.
Trouble with comments is that they drift from the code over time because most people do not update the comments, based upon my surveying of production code bases. If they are hidden, they will drift even quicker and become useless even faster.
I would urge developers to err on the side of too many comments over having too few comments, even if there's a risk of them going stale. I can deal with drifting comments, but I can't deal with missing comments.
This is caused by poor or nonexistent code review practices. Reviewers should be ensuring that related comments are updated if code functionality is changed.
i think in sean's proposed world we'd have metadata about that too! the comment in the context it was written in would be available, as well as all of the surrounding changes that potentially invalidate it. as well as potentially a whole discussion thread about what they meant when they wrote it, and suggestions about how to change it.
...and the main reason people don't like comments is because they clutter up the code that gives them the truth of the matter. But yes, if they aren't in your face forcing you to look at them rather than the code, then they are slightly more likely to not be ignored when the code is changed.
It would be nice if they could be like footnotes, or boxed out-takes, that could be pushed to side notes. We have had the typography, even if it was just markdown with a rendered code reading mode.
You don't need to abandon plain text to hide comments. Comments are detectable with a regex; an extension to hide comments would be trivial to make in most editors. I think it's not a common feature because people generally just don't want it.
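As a rough illustration, here is a sketch for C-style comments. It assumes only `//` and `/* */` comments and ordinary quoted strings; raw strings, nested comments, and similar language-specific features would need more care (or a real parser such as tree-sitter):

```python
import re

# Match string literals as well as comments, so that a "//" inside a
# string is consumed as part of the string and never seen as a comment.
_STRING_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\])*"'     # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'"    # single-quoted string or char literal
    r"|//[^\n]*"             # line comment, up to the end of the line
    r"|/\*.*?\*/",           # block comment, possibly spanning lines
    re.DOTALL,
)


def hide_comments(source: str) -> str:
    """Return source with C-style comments removed, strings untouched."""
    def keep_strings(match):
        text = match.group(0)
        return text if text[0] in "\"'" else ""
    return _STRING_OR_COMMENT.sub(keep_strings, source)
```

An editor extension would fold or dim the matched ranges rather than delete them, but the detection step is the same idea.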
Rich text code would create so many problems, too. You get locked into a special editor. You need special version control. Grepping becomes difficult. Diffs become difficult. You'd need a whole separate ecosystem, and for what? We have treesitter; we can already treat code like data.
Junior programmers tend to either document nothing, or document everything. With experience you realise that you just want to document the unusual stuff, and as you get more experienced, you realise there is less and less unusual stuff, so the amount of comments drop down.
So, less comments wins out, but faced with a code base without comments you have to inspect it to tell the difference between a beautifully crafted piece of brilliance or a totally flaky codebase written by a bunch of newbie hacks.
> and as you get more experienced, you realise there is less and less unusual stuff, so the amount of comments drop down.
This is a common experience when I use a new language or framework. At the start I comment a lot of things because I think it's useful for future me, but as I learn more I realize that many of the comments are redundant so I end up removing them.
You have to consider who the comments are for. Senior engineers' comments are more useful for senior engineers, but the junior comments will be more useful for junior engineers.
I don't know. Senior engineers may understand the point of the code already, and "why it's not written a different way." The comment could be explaining to the less-experienced devs why they shouldn't waste time on another intuitive-at-first-glance approach.
There's also the comments that tell you why the code does something obviously stupid but needs to continue replicating this stupid behavior because something else depends on it behaving this way.
That sounds like "explain[ing] why the code isn't written in another way" where the other way is "the way you are right now thinking it should be as you read this: the way that isn't obviously stupid".
I'm thinking less in "way the code is written" and more "the way the code works".
That is things like cases it doesn't support, or weird results in some cases.
One example is a shop I worked many years ago that had a function which did time conversion on date times without adjusting the date portion. Obviously, objectively, wrong.
However the callers to this function expected this errant behavior. Actually changing it to behave correctly would require a coordinated change. Putting that comment in place prevents a new hire from going "oh this is obviously wrong!" and fixing it, causing an outage.
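I no longer have the original code, so this is a reconstruction of the shape of it in Python. The function name and the comment wording are invented for illustration:

```python
from datetime import datetime, timedelta


def shift_time_of_day(moment: datetime, offset_hours: int) -> datetime:
    """Shift the clock time by offset_hours, leaving the date untouched."""
    # WARNING: knowingly wrong across midnight. 23:00 shifted by +2h comes
    # back as 01:00 on the SAME date, not on the next day.
    # Do NOT "fix" this in isolation: existing callers depend on the date
    # staying put, so correcting it here needs a coordinated change in
    # every caller at the same time.
    shifted = moment + timedelta(hours=offset_hours)
    return datetime.combine(moment.date(), shifted.time())
```

Without the comment, the body looks like a bug waiting for a helpful new hire. With it, the same lines read as a deliberate compatibility constraint.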
> A senior engineer writes comments that explain why the code isn't written in another way.
And C-level engineers write a comment "X hours have been wasted on refactoring this code. Should you decide to follow the example of your predecessors, please increment this counter."
Sometimes, even what appears to be an utter hack job actually is the best you're gonna get.
I think of comments as an apology to the engineer reading the code.
I'm apologising because the code isn't obvious, or the language not sufficiently expressive, or the good-idea-at-the-time no longer is. Ideally I wouldn't need to write many comments, but I often find myself sorry things are not simpler.
In the end a comment should be like a shortcut that allows people to understand your code faster or a reminder why certain choices were made.
You are right in spirit to say sorry when the code cannot be self-explanatory, but consider that sometimes even if the code is self-explanatory, comments can help your reader to see the grand picture faster or to avoid interpersonal ambiguity by allowing you to use two different ways of phrasing a thing.
I find that the purpose of comments is to give context to the written code. Sometimes the context explains what the code does, and more often it explains why, but the best comments give me, the reader, an insight so that I can have an intuitive feeling for what the author is writing.
Well, so, does this method return the login name of the user? Or is the "name" field in fact a company convention to return firstname and lastname concatenation? Oh and if it’s that, what will be the concatenation order? It doesn’t seem like it’s a parameter but is it configurable somewhere? Is this automatically defined somehow? Is it hardcoded? Or is there a name field in the database and if yes, what does it represent? Etc…
Within the code you'd have `return this.user.firstName + ' ' + this.user.lastName;` or similar, which can explain the details of how it works when you look into it.
I'm not super opposed to that, honestly. Doc comments are coloured differently from code, and really help you scan through a file quickly to find the next function. It honestly doesn't take more than a couple seconds to write, and (I find) it really helps readability enough to be worth it.
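For example, a short doc comment can settle most of those "which name is it?" questions before anyone opens the body. A sketch in Python, with a made-up `User` class:

```python
from dataclasses import dataclass


@dataclass
class User:
    login: str
    first_name: str
    last_name: str

    def name(self) -> str:
        """Return the display name: first name, a space, then last name.

        This is NOT the login name. The order is hardcoded here and is
        not configurable.
        """
        return f"{self.first_name} {self.last_name}"
```

Two sentences of docstring, and the caller never has to guess or read the implementation.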
I resonate with that example. It's usually about a piece of code that didn't turn out the way you wanted to due to an issue. If there's a literal github issue or source I can link, I'll include the URL in case someone checks later and it's no longer a blocker.
Sometimes I include brief explanations or links to a particular piece of hard-to-find documentation.
> A senior engineer writes comments that explain why the code isn't written in another way
I suggest providing that sort of high-level decision information in a separate design document. This sort of documentation is written before the code and is presented/discussed with management and peers.
That is how it was done at a couple of companies I worked at and it was very effective. Naturally, this was done at the feature level, not on a function to function basis.
Then you end up with a design document that corresponds to the code as it existed conceptually before v1 of that code was even written. By the time v1 is written, the design doc will be slightly out of date; by the time the code reaches v3, the document is more than 2 versions out of date.
The nice thing about comments is that they're in the code. You can't update the code without at least looking at them; that's more visibility than you'll get from anything else.
Design documents were updated as the code changed, obviously. Changing the code without updating the documents would not pass code review if somebody was shortsighted enough to try.
The kind of design doc that gets presented to management typically isn't checked into the git repo, in my experience. It's a google doc or a wiki page full of planning details, and it gets covered in amendments about whether XYZ will be ready in time and so forth.
Having documentation in your repo describing high-level architecture choices and peculiar quirks for future maintainers is nice, sure, but I think that's a separate document with different goals and relatively little overlap. It shouldn't be organized by feature, for starters.
Also, if the remarks in that document are granular enough to exist as comments, why not just leave them as comments? Then they're more visible and more likely to remain up-to-date.
That still needs discipline though - or you end up with N half-finished Confluence pages describing the intention behind the design, all of which are now out of date (and naturally in completely different places). The best way I've seen to keep track of changing things is to have the design linked to the ticket somehow (and if it's a link, then that needs to be a permalink to something that will not go away in a year's time).
I really have to push back against this "design document" stuff. Unless you're writing some sort of uber-complicated, mission-guidance-systems-level code with multiple standards and audit compliance and and and, then you don't need this.
Alternatively, if your org or reach for this feature is so large that you need to communicate and decide its internals with a lot of people and as such require something like a design document, then you've already failed and you have way too many cooks in your kitchen. Design by committee is always doomed to failure.
Thirdly... if you need a design document to communicate your feature to "management" then that means you don't have autonomy to design this feature as an expert, and again, you have too many "fingers" in your pie.
Does this mean you should go into a basement and design your feature like a hermit? No, but a design document shouldn't be your answer, it should be a better team and process around you, as well as a clear business specification.
> Design by committee is always doomed to failure.
If you need to communicate your design to a lot of people, that's probably because a lot of people are depending on the output of your work, and they want to make sure it suits their needs. That's not design by committee—that's just designing something that will be widely used.
I understood that to mean "I've tried doing it x way and it didn't work because y." rather than the functional part. At that point, why not keep the documentation together with the code?