I started taking readability very seriously once I started going back to extend old code and finding I couldn't immediately understand what it was doing.
Now, if I have that problem, it's now two problems. The original problem, and the readability problem. The readability problem is solved first, and the original problem can only be solved afterward.
I don't see comments helping me, I could spend the time better by making better method names, better local variable names, extracting methods, ensuring that lines do not run off of the screen.
Maintaining my old code got significantly easier after I started doing this.
I'm fixing someone else’s code right now and a few single line comments would have saved my client thousands of dollars.
Today I put in a log statement to see why the code deletes data from the database if there is no new temperature data for the time period. The cron job has been running all day and so far every time it attempts to delete the data, there is no data to delete.
Another line runs a different script if the time is 15:00. No idea what is magical about that time. I added a bunch of log statements to see what happens at 15:00 that is different from every other hour of the day. So far I have no clue.
I’m sure the original coder had a reason for inserting these bits of code, but damned if I know what it was.
There are dozens of instances like this in the code. A one line comment would have saved me hours of work and the client several thousand dollars.
I'll go one better: I've got servers in my machine room that I don't know the purpose of. Literally in some cases the way I've found out what they do is shut them off and wait for someone to complain.
I've seen stuff like that before, typically it's intended to be temporary code put in as a way of troubleshooting or achieving a non-standard result but wasn't cleaned up properly. I see that so often that I simply assume it's the case and not even try to run it down any further. Just make it work properly and move on.
Deleting from any empty database could just be a sanity check. If it's logically supposed to be empty at a given time, it's a perfect time to clean up database errors...
I don't know why people say this type of thing, as if there's some choice you have to make between good names and comments. You can have both, and there are absolutely times when comments are necessary. Too many comments may be a "smell", but code that doesn't require any comments at all is very unlikely.
I think a lot of coders jump to comments as a first resort rather than trying to make the code clear. In theory it's not a tradeoff, but in practice I've seen a lot of heavily-commented code with single-letter variable names. My rule of thumb would something like "never write a comment until you've spent at least 5 minutes trying to make the comment unnecessary".
The problem is the bug catch creep growing around DataInput and mathematical operations.
I think it would really help to have a language, that differs between the original methods raw body (displaying the pure intentions) and blends out the various filters and catchy expeption handling.
Such a function with catchy names, could be allmost as good as a coment- and is usually there.
Coments can be useless too
/* Function taking these arguments, returning this type */
Comments are not for how or what, the code does that just fine.
Comments are for why.
Sometimes the why is obvious, then you don't need a comment. The rest of the time, add that comment. Even if it's something like "Steve in accounting asked me to put this in."
Good comments do frequently answer the why question and, sometimes, the non-obvious what.
It's generally true that the code expresses the what, but it's also true that it can take the reader time to discern. A simple comment here and there can be a shortcut to this discernment which, over thousands of lines of code can save serious time.
Such comments are best added in commit messages. If you add JIRA task number to each commit it's almost always already there (because in the task there will be your discussion with Steve).
And when you change the line because Tom asked you to change it again - you don't need to remember to remove the comment about Steve.
I have recently started working on an existing project which is all new to me. Having to go through git commits looking for relevant comments is just a silly suggestion. Its far more useful when I am browsing through the code, trying to understand it, to see a comment beside the relevant piece of code, not hidden away in git commit messages.
IMHO comments aren't there for newcomers. You are a newcomer for a month or 2. You are a developer for years most often. Besides, how is
//Steve in accounting asked me to put this in
foobifyTheBaz(bazPtr, foobificationParams);
more helpful for newcomers than just
foobifyTheBaz(bazPtr, foobificationParams);
Comments answering "why" are very important when bugfixing/changing stuff. You want to know if sth was intentional or just accidental, and if it was part of the change that you have to override, or some independent change, so you know which behaviour should stay, and which should change. You should always look at the git blame of the region you change anyway before making a change (it often turns out your change was there before and commit message has the reason why you shouldn't change it back).
And you don't look for git commits. You enable git blame annotations in your IDE and hover over the relevant lines to immediately know who, when, and why changed that particular line. That's why commit messages are as important as good naming IMHO.
By the way, when you have code like:
//Steve in accounting asked me to put this in
foobifyTheBaz(bazPtr, foobificationParams);
barifyTheBaz(bazPtr, barificationParams);
if (mu(bazPtr)) {
rebazifyTheBaz(bazPtr);
}
How do you know which lines the comment refers to and if you should delete it or not when Tom asked you to change barifyTheBaz invocation? That's the main advantage of commit messages over comments - they are always fresh and encode the exact lines they refer to.
// We foobify the baz in order to ensure that both
// sides of the transaction have their grinks froddled.
// This was a request from Steve in Accounting;
// see issue #4125 for more details.
so that (1) if you'd otherwise be wondering "wait, what they hell are they doing that for?", you get an answer (and, importantly, some inkling of what would need to have changed for removing the code to be a good idea), (2) there's an indication of where you can find more details (hopefully including the original request from Steve in Accounting), and (3) if the code around here changes, you can still tell that the relevant bit is the foobification of the baz.
I strongly approve of putting relevant info in your VCS commits too, of course. But, e.g., if someone changes the indentation around there then your git blame will show only the most recent change, which isn't the one you want, whereas the comment will hopefully survive intact.
Changed formatting isn't a problem IMHO. You can ask git to skip whitespace changes, and you can (and should) enforce consistent formatting anyway to make history cleaner. And even if for some reason you don't want to do either - you can just click "blame previous revision" if you do encounter "changed formatting" revision.
I envountered it a few times and it was never a big problem.
The problem with comments in the code is - they have to be maintained "by hand", and they are often separated from the context after a few independent changes.
When you change a function called inside the foobify function because of another change request you will most probably forget to check all the calling places all the way up the callstack and fix the comments refering to foobify function. And then comments may start to lie.
I encountered lying comments a few times and it usually is a big problem. I started to ignore comments when debugging, and I'm not the only programmer that I know that does that.
I do think there is a place for comments in the code, for example for documentation of some API, but IMHO commit messages are the perfect place for explaining reasons of particular change, and I prefer not to repeat that.
In which case it's one click away. And anyway you should enable checkstyle and autoformat on saving files, and you can use "ignore whitespace changes" in git view for legacy code.
While I don't agree with the general sentiment of vinceguidry (e.g. I think comments answering "why?" are very important), for your specific counterpoints I do tend try solve it in code if possible:
2) Vector3 unitVectorWithDirection(Vector3 originalVector_mayBeZero); vs. Vector3 unitVectorWithDirection(Vector3 originalVector_throwsIfZero); (if in a supporting language, throws declaration would also clarify).
3) float ArcTan(float x_throwsIfZero); or float ArcTan(float x_returnsNaNIfNot0to1), etc.
This assumes:
(a) You're not working in a language or environment that supports range constraints in the first place, cause if you are then that's ideal.
(b) You're working on personal code or a small team. If you're writing code for public consumption you have no choice but to add detailed API documentation if you want to be successful, even if it's not DRY.
The sibling comment about renaming is good - particularly with regard to the first function. However I think the other two are better suited to documentation rather than trying to make the function signature explain itself. Sooner or later you'll just remember that it takes a float, not the variable name.
Certainly calling things like "x_returnsNaNIfNot0to1" works, but I find it a bit ugly and it gets complicated if you have multiple or more complex constraints.
This is where languages with docstrings are nice. In Python all I would do is add a """ comment describing the inputs and the return values.
You then have a) a comment describing the code; and b) documentation that's standard so people can pull it up with pydoc or ? in Ipython. When I'm working in Jupyter, I often hit shift-tab to check what a function is expecting.
Was just looking at the Oculus SDK math library, and they have some really neat ideas...
For one, handling rotation sign (is positive clockwise or counter-clockwise), left-handed vs. right-handed coordinates, etc. through c++ template parameters.
I prefer to put that kind of stuff in the javadocs/xmldocs, but I mostly work in C#/Java, with IDEs that have tooling baked in that make generating that kind of documentation A.) really easy to generate and B.) very useful in auto-complete lists, mouse-over popups, etc.
Ideally, you'd have some tests as well that would test those kind of corner-cases and illustrate failure cases and expected outputs.
When you are using a particular method that is required by the context of your work rather than code you have control over. Perhaps the method name in the framework you are using is non obvious, or there is a bug in the way it works.
Sometimes comments are just to save future you some time or help other developers find context.
Suffice to say any project of significant complexity will probably require comments at some point. That complexity can come from the code, the task, the stakeholders or the dependencies.
Unless you are happy with a method name like getSimpleProductInstanceThroughCoreFactoryBecause ModuleRewriteWillBreakCompatibilityWithSecurity PatchSUPEE5523 ()
In that case I guess you're right.
Edited to stop the method name stretching HN layout.
That introduces unnecessary complexity in the code for something that could just be explained in a comment.
There were no fixes performed in the method, the method just gets a product instance from a factory. The comment gives reasons for why it was using a core factory directly instead of using the automatic object resolution mechamisms.
The real method name was getProductInstanceFromCoreFactory. Still long, still clear as to what it is doing. But making the context clear would be more code than it is worth.
Avoiding comments by writing clearer code isn't a bad habit, but my point is that they are very useful to provide context for something that getting context for would otherwise be erroneous or cumbersome to implement.
A good marker for when comments are going to be useful is when something stumps you, and then you fix it and/or figure it out. Chances are, when you read the code again it will stump you again - or on re-reading you might not even realize there was an issue. So I comment the change. It's usually something simple like, "This cast avoids a special case in method X", or along those lines. The comment is the "why" where the code is the "what".
If I run into that situation, I refactor the code so that I can better understand it the next time. To not do so is to waste all that time you spent understanding it.
> It's usually something simple like, "This cast avoids a special case in method X", or along those lines.
If I had an issue like that, I'd fix method X to be more accepting of unclean inputs.
The problem is that method X is a remote invocation into another system that has different change-control procedures, a different ticketing system, and a different release schedule.
I'd use a gateway class and intention-revealing method names, even if all that method was doing was casting a value. I'd call it ".edge_case_fix", (but described better than "edge case") and do the same for every weirdness in the external system that requires workarounds.
I find one place where comments are absolutely critical is in external facing APIs. Summary of the purpose of the API, purpose of each method, valid arguments and possible return values. Think how often you need to read the comments for whatever libraries and APIs you use in your work.
I deal with two types of APIs. Well-documented and otherwise. For well-documented APIs, I can simply re-read the documentation to figure out the other side of my code.
For the rest of it, I try to write as cleanly as possible, and as robustly as possible the first time, so I'm not in the dark the next time I have to look at it. If I have to take time out to re-understand how it works, I'll just take the time and not feel guilty about it. The company wanted me to use that dreck, I'm not going to feel bad about having to take more time with it.
I don't think comments would help much in the second situation given decent engineering the first time around. If it's my code, I'll usually recall the idiocy I had to work around when running down whatever issue had me looking at it again.
Comments are incredibly useful for that bit of code that you spent hours trying to make it work and you couldn't figure out a way to name things well or to improve it. That's where a comment is priceless.
The downside of course is that comments aren't compiled/run - so they often become out of date. "Often" might actually be an understatement.
I can't tell you how many times I've been reading a comment that runs quite contrary to what the code really does. You then end up reading the code to reason about it anyway, paying two taxes. After too many of those experiences, you just end up going straight to the code as the source of ground-truth.
This all depends on the readability of the code / and how soon these comments get out of sync. Maybe my experience is atypical, but I doubt it.
Names for classes, methods, parameters, and variables all suffer from this same problem. I think the solution to align code and comments is code reviews. Not writing comments also works but fails to solve related issues.
Where I work, I've always required that code that's submitted for review has a descriptive commit message. That is, the commit should explain what the problem was and how it was fixed, and why it was done that way.
This way, I can run git blame (or the equivalent VCS command) on the file and get a reasonable set of comments that usually match up with the code.
Maybe it's because I write in Ruby, and so never have to do any tricky optimizing, but if solving a particular problem starts to run over a half hour, I look to re-architect the project, either by reaching for a gem or telling my boss that X is too hard and we should do Y instead. My patience for going down rabbit holes has mostly gone over the last year.
Also, again perhaps because I use Ruby, it never takes me more than a few seconds to name something. If it does, then it's a code smell and I go looking for the missing class.
I don't see how Ruby saves you from optimising (to my limited knowledge, it's not exactly a fast language), but I agree with you that naming stuff sensibly and extracting functions (which you can then name sensibly) is most important for maintainability and can make "in-code" comments unnecessary in many cases. However I strive to always document what a function does if its not obvious -- though I'd call that "documentation" and not "comment".
Example for obvious: Int add(Int x, Int y) in a typed language. Example for not obvious: add(x, y) in an untyped (or "dynamically typed") language (Does it auto-coerce? How? Can it add complex numbers? In my particular representation? ...).
Someone mentioned that sometimes comments are useful e.g. to document a quirk/bug in library function you call, and I have some examples of that in my own code. But most often you should be able to rectify that by wrapping said function in your own one that omits the problem.
If a function seems impossible to give it a sensible name that isn't ridiculously long, split it up. The more clear code of the individual parts and how they are combined should give a hint of what is actually computed here. Maybe it's a new concept in your business logic, in which case providing a clear and exact definition makes sense anyway (put it into the appropriate place in your documentation). This whole procedure can take a significant amount of time, but it will be worth it in maintainance!
> I don't see how Ruby saves you from optimising (to my limited knowledge, it's not exactly a fast language)
That's precisely why you don't optimize. If you find you need fast code, you use a different language. Ruby is the language you use when maintainability and extendability take priority over speed. It's excellent for web development, where any speed improvements you make will ultimately be dwarfed by network latency.
> Example for obvious: Int add(Int x, Int y) in a typed language. Example for not obvious: add(x, y) in an untyped (or "dynamically typed") language (Does it auto-coerce? How? Can it add complex numbers? In my particular representation? ...).
That's a good observation, and it's made me think about how I deal with this in Ruby. First, generally you can tell by looking at a method's code what it's expecting you to pass to it.
Second, you don't generally pass around complex types to library functions, you use basic Ruby value objects like strings, symbols, hashes, and arrays. A gem will often define its own classes, which you might pass objects of around, (money, phone numbers) these will often be the primary focus of the gem, and how to use these objects will be written right there in the documentation. These classes will typically have "parse" functions that will take random input and turn it into a more useful object.
In Ruby, you generally only pass around complex objects in your own code, using JSON or some other format to interact with external systems
> That's precisely why you don't optimize. If you find you need fast code, you use a different language.
Ah, now I get you :)
> First, generally you can tell by looking at a method's code what it's expecting you to pass to it.
Here we might differ, I would always prefer to state clearly in the function doc what types of parameter values are allowed. For example, even if you have a really simple little wrapper in javascript:
function log(x) {console.log(x);}
Without documentation, you have to know what console.log can do for different types. So I'd definitely prefer this:
/**
* Logs x to console.
* @param x a value of a primitive type (other types are not guaranteed to be logged in a readable manner).
*/
function log(x) {console.log(x);}
> A gem will often define its own classes, which you might pass objects of around, (money, phone numbers) these will often be the primary focus of the gem, and how to use these objects will be written right there in the documentation.
Yes exactly, it will be documented as any public API should be. I have no problems using opaque types. But a function call(x) which expects x to be some object representation and not any old string (for which the library has constructors, e.g. PhoneNumber(string)) should surely document this, no?
In my book it goes 1. types 2. tests 3. method/variable names 4. comments. If you can't make the code clear any other way then add a comment, but it should be a last resort.
Now, if I have that problem, it's now two problems. The original problem, and the readability problem. The readability problem is solved first, and the original problem can only be solved afterward.
I don't see comments helping me, I could spend the time better by making better method names, better local variable names, extracting methods, ensuring that lines do not run off of the screen.
Maintaining my old code got significantly easier after I started doing this.