What separates good code from great is emphatically not style. It's the fundamental architecture: the omniscient vision to create a program that is somehow much more maintainable, much less likely to have bugs, or much more extensible than dozens of other potential solutions that any competent programmer would agree are "good". Great code can come from a combination of repeated iteration and deep thought, or it can be serendipitous, only emerging as great through the test of time. It's not only great programmers who write great code, and being a great programmer is no guarantee of producing great code consistently. It arises at an ideal intersection of mathematics, cognition, and utility. I hope to write some, someday.
Yeah, when I was a less experienced programmer, I spent a lot of time formatting code. I made everything ultra-nice looking.
But you can only get 10% more readability this way. Any style is readable as long as it is consistent. You get used to reading non-aligned code, and the advantage of faster editing/diffs makes it better IMO to stop aligning it (I used to do that).
The real difference comes when you learn to break down your problems correctly, and control dependencies. Another way of saying this is to make the program structure match the problem, rather than manually compiling lots of irrelevant details.
Part of the benefit is that you literally will have fewer lines to read. There will be fewer lines of code, and you will have to read and understand a smaller portion of the code to make a given change.
Code formatting is just a small part of maintainability. I agree this article seems "small".
Failure to use proper spelling in code is a cultural disease. Misspellings in English are looked down upon so why do we put up with it in code? Well I don't.
The author's code examples go back and forth between spelled correctly and not spelled correctly, so I'm not quite sure what he's advocating. Figure 13 is especially hilarious.
pload: I have no idea what 'p' is
qload: I have no idea what 'q' is
numBuses: should have been busCount probably
numTransLines: what is 'trans' ?
systemName: ah a good one!
gens: should have been generators, but even in that case it's basically useless. I already know it's a vector of Generators; what is its higher-level function?
transLines and buses: same as 'gens', just a name of the type.
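For what it's worth, here is roughly how I'd expect those fields to read with the names spelled out. It's only a sketch, in Python rather than whatever language the article uses, and several things are my guesses rather than the author's: I'm assuming 'p' and 'q' mean active and reactive load and that 'trans' means transmission, and the `PowerSystem` and `Generator` class names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Generator:
    """Hypothetical placeholder; the real Generator fields are not shown."""
    name: str


@dataclass
class PowerSystem:
    """Hypothetical container mirroring the names discussed above."""
    # systemName was already clear, so it only changes case convention.
    system_name: str
    # Was pload. ASSUMPTION: 'p' stands for active power.
    active_load: float = 0.0
    # Was qload. ASSUMPTION: 'q' stands for reactive power.
    reactive_load: float = 0.0
    # Was numBuses.
    bus_count: int = 0
    # Was numTransLines. ASSUMPTION: 'trans' means transmission.
    transmission_line_count: int = 0
    # Was gens. Spelled out, though a name describing its role would be better.
    generators: List[Generator] = field(default_factory=list)
```

Even spelled out, `generators` still just repeats the type, so the last complaint in the list stands: a name that says what the collection is for would beat either version.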
Take any book you haven't read that is openly available online. Remove all vowels in words longer than 4 characters. Replace commonly used words (like 'the') with your favorite abbreviation (like 'T'). Truncate extra long words (9+ characters). Then try to read it. Horrible, isn't it?
Reading abbreviated (or worse, one-letter) variable names is like translating from a made-up language.
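If you'd rather not do the experiment by hand, something like the following would do it. This is a rough sketch under my own assumptions: the comment doesn't say what length to truncate to or in what order to apply the rules, so the 8-character cut-off, the rule order, and every name here (`mangle_word`, `mangle_text`, `ABBREVIATIONS`) are made up for illustration.

```python
import re

VOWELS = set("aeiouAEIOU")

# Only 'the' -> 'T' comes from the comment; extend with your own favorites.
ABBREVIATIONS = {"the": "T"}

# ASSUMPTION: words of 9+ characters are cut to their first 8 characters.
LONG_WORD_LENGTH = 9
TRUNCATED_LENGTH = 8


def mangle_word(word: str) -> str:
    """Apply the three abbreviation rules to a single word."""
    abbreviation = ABBREVIATIONS.get(word.lower())
    if abbreviation is not None:
        return abbreviation
    if len(word) >= LONG_WORD_LENGTH:
        word = word[:TRUNCATED_LENGTH]
    if len(word) > 4:
        word = "".join(letter for letter in word if letter not in VOWELS)
    return word


def mangle_text(text: str) -> str:
    """Mangle every alphabetic word, leaving spacing and punctuation alone."""
    return re.sub(r"[A-Za-z]+", lambda match: mangle_word(match.group(0)), text)


print(mangle_text("the number of generators"))  # prints "T nmbr of gnrt"
```

Feed it a chapter of anything and the result reads a lot like the identifiers being complained about here.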
As a Teaching Assistant for a first year computer science course, I have actually been using the example of writing an essay and keeping within a page limit by randomly cutting vowels out of words and thinking up ways to abbreviate words which aren't meant to be abbreviated.
Everyone quickly agrees that it would be ludicrous to do that. By stressing over and over, "try to write code which anyone could understand at a glance", I think they're getting it. Glad to see some like-minded thoughts!
Even worse than having to read weird abbreviated variable names is having to write them. You know the variable you want to write is something to the effect of "the number of buses", but then you can't remember if it's numBuses, nBus, nBuses, num_buses, or God knows what. It gets even worse when the original author starts dropping random vowels, so you have to guess nmBuses or nmBs or something horrible like that. Who knows what letters they decided to omit?
Much easier to just write out a clear, concise variable name with no abbreviations (other than those that are extremely obvious and common). Even if the variable names are a little longer, your text editor will autocomplete them (right?).
A standard argument against aligning multiple similar lines of declarations as a table is that you constantly need to re-align when you add/remove declarations. One implication is that source diffs become larger than necessary as they will now include lines that haven't actually changed content-wise, only layout-wise.
In my younger years, I was extremely meticulous about tabulating anything in my code that was remotely tabular, but these days I pay less attention to that. Constantly re-aligning is a pain, as you mentioned. Another thing I've found is that it forces you to spend brainpower on decisions that are ultimately pointless. For instance, if the code is mostly tabular, but one line has an extra column, what do you do? Stick that column into an existing column? Add another column that will look weird with that one entry hanging? Etc.
Another really irritating result of lining everything up just so shows up when you do a search+replace on a bunch of code and change the length of an identifier. Suddenly you've screwed up the formatting on a bunch of tables without even knowing it! This is especially annoying when doing a search+replace across multiple files, with lots of code.
That's not to say that lining things up sometimes is a bad thing, particularly when you really have a table (e.g. initializing a big array of structs). Sometimes having things lined up makes it easier to edit later (e.g. using Vim's visual mode). But I'd say that 90% of the time, it's a waste of effort that could be spent, say, making your code actually work better, rather than look better.
Have you checked uncrustify? It doesn't do all the work necessary for the tabular alignment illustrated in the article, but it does a great part of it. I personally use it with a keyboard shortcut to fix the selection of the code I just wrote or edited. In this way I'm faster because I skip most of the white space characters.
In emacs, one can take out the drudgery of manually aligning and realigning your variables in tabular fashion using the align commands[1].
In certain revision control systems, one doesn't keep differences, but whole objects, so the point is moot there unless you have some serious bandwidth concerns.
It's not only about bandwidth concerns, it's more about making commit diffs readable. If you have ten lines of tabular assignments and you changed the sixth, but that made you realign everything, then you'll see 10 green lines + 10 pink lines in your commit diff, even though you should only see two lines in total.
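To put a number on that, here is a small sketch that counts the lines a line-based diff would flag for the same one-identifier rename, once with plain assignments and once with the `=` signs aligned. The variable names, the helper functions, and the use of Python's `difflib.SequenceMatcher` as a stand-in for a real VCS diff are all my own assumptions.

```python
from difflib import SequenceMatcher


def render_plain(names):
    """One assignment per line, no alignment."""
    return [f"{name} = 0" for name in names]


def render_aligned(names):
    """Same assignments, with '=' aligned to the longest name."""
    width = max(len(name) for name in names)
    return [f"{name:<{width}} = 0" for name in names]


def changed_line_count(before, after):
    """Removed plus added lines, as a line-based diff would report them."""
    matcher = SequenceMatcher(a=before, b=after, autojunk=False)
    return sum(
        (a_end - a_start) + (b_end - b_start)
        for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes()
        if tag != "equal"
    )


old_names = [f"value{index}" for index in range(10)]
new_names = list(old_names)
new_names[5] = "a_much_longer_name"  # rename only the sixth variable

plain = changed_line_count(render_plain(old_names), render_plain(new_names))
aligned = changed_line_count(render_aligned(old_names), render_aligned(new_names))
print(plain, aligned)  # prints "2 20"
```

The plain layout reports one removed and one added line; the aligned layout reports all ten lines removed and all ten re-added, which is the 10 green plus 10 pink case.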
Forgive my bluntness, but why is ACM publishing a couple of academics' personal choice of lightweight coding standard, and why are so many Hackers upvoting it? Is there some hidden depth here that I'm missing, or some underlying research that shows these preferences are objectively better than some of the alternatives?
Sometimes it's useful to have a new way to think about the same boring old problem. Personally, I like this set of guidelines because it focuses on things that make code more readable and pleasant to work with instead of getting bogged down in minutiae like where braces go or whether or how I have to import things.
This reminds me of a blog post I read several years ago where the author connected the concision of code to the experience the author of the code had. More experienced coders needed fewer comments and could read higher-density code with less mental effort.
Wish I could find that blog post again, I've thought of it from time to time over the years.
edit: speeling.
edit2: Yegge strikes again. Wow. Amazingly influential guy for me.
This is interesting, and not something I'd consciously thought about before, but quite often I will go back after I've finished something and add some white space and (sometimes quite thorough) comments to the code I've written. I especially do this with open source code that I plan to release.
My reason for doing this is that I want to make it clear to whoever ends up maintaining the code after I've left the project. I have no idea what their skill level might be and getting up to speed on someone else's code can be tedious. Apparently this also makes me look like a less-skilled developer?
I've spent quite a bit of time working on systems that became untenable and had to be partially re-written or completely re-written. The result is the original code was either a baseline or a reference. I will say, without a doubt, comments are a code smell. Any comments.
If you write a piece of code and think you need to comment it because it won't be clear to the next guy, it is good at best.
Furthermore, comments rarely help. They are written while you are in complete comprehension of the program or that part of it. The next person (assuming the comment is something they are using to figure it out) won't be.
That said, if you can do nothing to improve the quality because of time, lack of interest, or necessary complexity, at least make your best effort to comment what it does, appreciating that this is a minimal quality-improvement action. Sometimes it is just a piece of code that has to be optimized; explain why, so the next guy doesn't refactor/re-write it.
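As a made-up illustration of the kind of comment I mean, one that records why the code looks the way it does rather than restating what it does (the function and its scenario are hypothetical, not from any real project):

```python
def count_known_ids(candidate_ids, known_ids):
    """Count how many candidate IDs appear in known_ids."""
    # WHY: known_ids is copied into a set once so that each membership test
    # is a hash lookup. Testing against the original list would rescan it
    # for every candidate. Please don't "simplify" this back to
    # `candidate in known_ids` without measuring first.
    known = set(known_ids)
    return sum(1 for candidate in candidate_ids if candidate in known)
```

A comment like `# convert to set` on the same line would add nothing; the reason is the part the next guy can't recover from the code.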
Personally, I think (and judging from the reactions of the people I work with, they seem to agree) that my code is solid: efficient, well organized, logical, descriptive but not too verbose, etc.
Your comment has made me go back and re-evaluate some things, though, and look over some of those comments - and it was at that point that I realized that a lot of them are superfluous and serve no real purpose other than to restate what's already clear in the code. Others in the particular code I was looking at were about things like UI customization and how one might extend the existing code in order to do that, which more correctly belongs in the documentation (and would have wound up there eventually).
Again, most of this had been in open source PHP/Ruby/JavaScript code, where I'd assumed that the average tinkerer wouldn't be as well-versed in the language, but now that I think about it, if they're not, they need to learn before tinkering, rather than me trying to "dumb it down" to their level, which probably just invites disaster on multiple fronts.
So in my attempts to make things tidy and easy for the next person to come along, I was instead just injecting garbage that didn't need to be there.
Which kind of sucks to realize, but on the other hand, I suppose it's better than realizing you've been commenting the hell out of everything because the code actually doesn't make sense without the comments.
I have done research programming into new algorithms in a pretty exotic area. Those sorts of algorithms are not comprehensible without documentation, i.e., comments.
OTOH, outside of that, I would agree that comments have a limited use.
As a corollary, I can tell a code review session has reached its end when we start debating coding style, which I find happens about 2 minutes after we agree to talk about coding style at all. I've never found the intricacies of variable naming helpful, though I've participated in a number of discussions on the subject.
Code style guidelines are just that: guidelines. Being able to work in a group with a broad range of coders (from inexperienced to hyper-genius) is a highly useful skill for a coder.
Being able to communicate about the project to management, customers, and online is an even greater skill.
What's wrong with Code Complete? You might pick a couple of nits, but these things have to be explained somewhere, and Steve McConnell does a good job there.
With regards to figure 1 I'd argue that avoiding duplication is far more important than alignment. When I first looked at figure 1 I paused to see if duplicating the case was what was meant or whether each line should match lower case and upper case. To me that signifies less readable code since I've had to re-read it to try to understand it.
It amuses me it goes straight from making the point that tabular layouts are clear to a rather confusing table in Figure 4 where what should be a header looks like normal text.