Many years ago, in an age of klocs and flowcharts, at a large three-letter computer company, there were iron-clad coding rules that had to be obeyed, no questions allowed. In general you could see the reasoning behind them, but such bureaucratic reasoning doesn't pay off.
One example of this was that all numeric values used in a program must be factored out as symbolic constants. The reason for doing this is obvious, but it failed to account for the fact that some numbers have intrinsic meaning that should allow their use directly. But to be compliant with the standard, our C code had this boilerplate up near the top:
#define zero 0
#define one 1
This, of course, only served to make every page of code harder to read. And it didn't really even solve the problem it was meant to. We once found in some source code:
#define thirteen 13
which of course did nothing to show us why that particular value was relevant.
As it happens, I know an end user of this particular beast, so I show her some record IDs of each type and ask in what way they differ. She looks for a few seconds, then says: 'This is clearly a type one record, the next is a type two, and here you have a type three'. After a little back and forth, it turns out even our internal course for new personnel talks about 'records' with different rules to follow for 'type one/two/three'. The end users have no other word to describe these entities than 'records', each with its own numbered type, and think about it as completely natural. The application is based on an older COBOL application, and the terminology just stuck around.
Clearly, the application is well-written: The enum uses the terminology used by the business. Another of my flabbers just got ghasted.
This sounds like those people formed a neural network. They are able to learn to act quasi-intelligently, but when you peek inside, the internal model is just a bunch of numbers.
Edit: scratch that neural network. They just made one neuron so far.
Sometimes people just can't think of a better term and it sticks. It makes sense if there's some ordering implied, but in quite a few of these there's none.
One benefit there is that you aren't left with a name that only describes what type one records used to be like before the system evolved. The only baggage kept is by people aliasing that name in their own memory.
You don't get screwed because you called something apple to start with, but it now represents a race car.
Or, just program in INTERCAL (https://en.wikipedia.org/wiki/INTERCAL)! Integers are input in English (e.g. THREE FIVE NINE) and output in Roman numerals.
Using NULL for the integer 0 requires a cast in C (because NULL may have a pointer type). That doesn't break the GP's rule, but then neither does `!I` or `I-I`.
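A quick sketch of that claim, assuming `I` stands in for one of the parent's Roman-numeral constants:

#include <stddef.h>

int main(void) {
    int I = 1;                 /* stand-in for the parent's Roman-numeral constant */
    int a = (int)(size_t)NULL; /* cast needed when NULL is ((void *)0); yields 0 in practice */
    int b = !I;                /* logical negation of 1 is 0, with no literal in sight */
    int c = I - I;             /* likewise 0 */
    return a + b + c;
}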
For extra fun, put it into the code like this, including the `...` instead of the actual values. (Make it compile by commenting that line out.) Then define IV somewhere else, but make sure to use III before every use of IV so that someone unfamiliar with these constants sees the "right" definitions first.
When formulating the rule, they didn't understand the point of it. They thought "naked constants === bad" was equivalent to "no numbers in source".
Which, of course, is wrong. They're not the same. The same bastardization happened to Hungarian Notation, IIRC. It was meant to add semantic information to variables, like `iterN` for a counter or `lenX` for a length, but someone somewhere decided that you were meant to prefix with the most primitive type info, so you get stupid shit like `intN` and `intX` and aren't much more informed. (Especially with an IDE showing the type info.)
It's "obvious" if you take into account the corporate culture I guess. The rule of 'no magic numbers' is generally good, but then one has to remember that some numbers (like 0 and 1) are not magic.
Let me tell you about the system where null and 0 are treated as equal, because some programming languages will coerce null to 0. It just so happens that for an important business type in this system (a sketch of the resulting hazard follows the list):
- the range of the type is the non-negative integers up to (2^32)-1
- semantically, the value cannot be null
- when the value is 0, the code treats it as a special case
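A minimal sketch of that hazard, with hypothetical names:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration: 0 is a legal value of the business type, but it
 * is also what a coerced null becomes, so this check cannot tell a genuine 0
 * apart from a value that was null somewhere upstream. */
static bool looks_missing(uint32_t business_value) {
    return business_value == 0;  /* also true for a real, meaningful 0 */
}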
Something like `#define zero 0` is just bad programming, barring some philosophical code which cares about "zero-ness". Names need to reflect what they represent, not their precise values. Are we looking up the first index? Then how about
#define first_index 0
? Are we summing up values which are coerced to zero when empty? Then how about
#define empty_value_integer 0
or
#define additive_identity 0
?
Yes, that means there might be more than one variable with a value of zero, and that's completely fine since we now have several megabytes available for both source code and for compilation.
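A minimal sketch of how the two hypothetical names would coexist:

#define first_index 0       /* zero because C arrays are indexed from 0 */
#define additive_identity 0 /* zero because it is the starting value of a sum */

static int sum_values(const int *values, int count) {
    int total = additive_identity;
    for (int i = first_index; i < count; i++)
        total += values[i];
    return total;
}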
To be clear, I'm not endorsing the rule, just saying that there are more productive ways of dealing with it.
> barring some philosophical code which cares about "zero-ness"
But that is the intrinsic meaning OP was talking about. Plenty of mathematics deals with 0 or 1 as important constants. If you have a function that mods by a parameter, you need error handling for zero. Calling it value_that_causes_undefined_behaviour isn't going to improve code clarity. Beyond that:
My point is, if you had to pull `0` in that `for` loop into a variable, would you call it `zero` or `first_index` (or something else actually relevant to the loop)? I'd certainly prefer something relevant to the loop rather than `zero`.
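A minimal sketch of the mod example above, where 0 is intrinsically meaningful (the names are made up):

#include <errno.h>

/* Here 0 matters precisely because it is zero: x % 0 is undefined behaviour
 * in C, and no rename of the literal would make this check clearer. */
static int checked_mod(int x, int m) {
    if (m == 0) {
        errno = EDOM;
        return 0;
    }
    return x % m;
}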
'zero' and '0' are the same label, for the digit 0. There is no advantage to re-defining it, in fact, it does the opposite. It's a good practice to define labels for magic numbers to promote more maintainable code through the DRY / single source of truth principle, document the magic number through the label name, and improve code archeology by enabling searching by label.
These principles hold when the label is given a clear and concise name, which is an art.
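For instance (labels made up for illustration):

/* Each label documents its number and gives one searchable,
 * single source of truth. */
#define MAX_LOGIN_ATTEMPTS 3         /* lock the account after 3 failures */
#define LOCKOUT_SECONDS    (15 * 60) /* how long the lockout lasts */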
We have something similar in my organization's Fortran code. We have constants KI1, KI2, KI3, ... etc. corresponding to (k)onstant integers 1, 2, 3, ... etc. Someone got the idea that hard-coded numbers were bad. All it does is make the code harder to read for anyone unfamiliar with these conventions.
Did you have an automatic static code analysis tool to flag non-compliance? This is the sort of thing I point to when warning people away from blindly spending money on such a tool, installing it, and calling it good.
/*
 * Iterate through the groups which hold BACKUP superblock/GDT copies in an
 * ext4 filesystem. The counters should be initialized to 1, 5, and 7 before
 * calling this for the first time. In a sparse filesystem it will be the
 * sequence of powers of 3, 5, and 7: 1, 3, 5, 7, 9, 25, 27, 49, 81, ...
 * For a non-sparse filesystem it will be every group: 1, 2, 3, 4, ...
 */
static unsigned ext4_list_backups(struct super_block *sb, unsigned *three,
                                  unsigned *five, unsigned *seven)
To me, that is a 20 foot tall neon orange flashing Chesterton's fence. I think, "surely there is a very strong reason for this." Maybe not a good reason, but definitely a compelling one.
If I was code reviewing this, I would almost certainly insist on a different name to make the WTF go away. I’m not going to suggest one here despite having read the other code comment, as doing so would be bikeshedding — not my codebase and I have no skin in this — the only point I will insist on is that I cannot believe that “three” is a sensible choice of name for whatever it does. Not even what the comment implies it does.
Totally agree, it would definitely be better to rename it, but I read in another comment that the kernel is loath to rename things. So then it makes you wonder: who allowed 'number_noun = N' in the first place?
"Three is the number of the counting, and the number thou shalt count to is three"
When my kids ask me questions like this, "who invented the number three," I usually tell them it was someone vaguely European of the Enlightenment era with the same last name. "Ah, it was Ulysses Herman Three, inventor of the natural numbers and famed bicyclist." Usually buys me enough time to look it up online.
> To me, that is a 20 foot tall neon orange flashing Chesterton's fence.
It's not a Chesterton's fence, though. The whole point of Chesterton's fence is that whoever put up the fence you want to take down couldn't be bothered to add signage explaining why the fence was present; thus either it isn't important to keep the fence up long term, or they're an idiot, in which case there's no reason to believe they were correct in deciding that having a fence there was useful in the first place.
With this fence, if you walk a few meters to the side (read: select 'ext4_list_backups', then hit ^F and ^G a few times), et voilà, there are your signs.
'Without digging deeper' isn't really what matters here. A better question would be 'after some very cursory further digging, are we worried?' The first thing I ask myself when I see something that looks as wrong as this does is "Is it possible for this piece of code to be completely broken, and for no one to have noticed yet?"
This is really a question about two processes - the one where people look at, read, edit and discuss the code, and the extent to which this is discoverable by them. And the one where people run the code and expect certain things to work and would notice if they didn't work, or didn't work correctly.
The "unnoticed bug" possibility fails both of those checks. This file has been edited 8 times in the last year, by 5 different people. It's not so long that several of them wouldn't have read through the whole thing. This bug, if it was a bug, is unlikely to have persisted for 6 years without being noticed. It's not obfuscated or deeply technical in any way.
Secondly, this is code which is fairly critical to millions of systems. Ext4 is in literally hundreds of millions of devices. It's used in data centers in contexts where people are likely to follow up on any unusual behavior by their filesystem. The code in question is likely to be very well exercised in tests. It's likely to be extremely well exercised by people doing research stress testing/comparison testing of filesystem code. It is undoubtedly very well exercised by real-world deployments.
The bar to assume that you know something the person who wrote the code didn't, as opposed to the other way around, should be very high here. Even giving all the benefit of the doubt (which you should always do; never assume that stuff 'probably works' without evidence), it's not hard to dismiss this worry.
> The first thing I ask myself when I see something that looks as wrong as this does is "Is it possible for this piece of code to be completely broken, and for no one to have noticed yet?"
Good reflex. In my first job out of school (big research lab), we would occasionally find odd bugs in the code base (as would be expected...bugs happen). The head of the project would often wonder, “how could the system ever have functioned with this bug? Someone must have deliberately tampered with the source to discredit us.”
The first few times I thought he was joking, but no: he seriously believed this and even tried to investigate.
This was years before source control, which would have answered the question.
Three years ago, I was sent as a consultant to help a struggling, small, local health insurance agency. They had a three-man team; the lead had been there for fifteen or twenty years (he built their system). He assured me on the first day that they didn't really have any bugs and that the system was pretty stable.
We were called in to help them because their management (and users) were aware they were underwater: the volume of bug fixes, the live production data changes they made to keep things working, and the impossibility of adding anything new to the system without tipping something else over.
It seems the attitude of "there's no chance we have any bugs" is pretty prevalent everywhere I've ever been, along with the usual mix of "how did this ever work", which I still always find pretty shocking to run across.
The problem with this argument is that this particular snippet of code doesn't get run very often. It's the bit that works out where the superblock (and all its backups) are stored. Now, normally, only the one copy of the superblock is ever read - the backups are only read if the first copy of it is corrupted in some way. So almost all the time, the results of this code are not used. So if it were broken, then yes it could possibly go unnoticed for 6 years.
That's only a problem with one of the arguments, and is exactly why there are two approaches. Some errors are hard to detect by reading or working on the code, but easier to detect from behavior. Some are the other way around.
In this case, I do not agree that this could have gone unnoticed for a long time, even only looking at the behavior of the running system. When multiplied by the number of installations and how much filesystems are used, it is run a lot. It's also the kind of thing people investigate when it happens in certain settings - we lost a block and our whole filesystem was corrupted. And the kind of thing people test for.
It’s a local variable so it seems that there would be little impact in renaming it (and the other two). But if the local (kernel in this case) coding conventions discourage variable renaming, a brief comment referring to the later comment would be worthwhile.
Over-commenting is a problem, but this would be a good use case if renaming is not appropriate.
Not really. Nobody sane would think another sane developer with any experience would name a simple variable after what it initially contains. You'd expect either a name related to the meaning of the contents, or a meaningless name.
I would never ascribe to a previous programmer (unless he had just learned what variables are) the intention of creating a variable (not even a constant, which wouldn't make it any better) named 'three' and assigning the actual value 3 to it.
All three lines would be "insane" if the variables weren't mutated. As in, a variable named "three" is insane if it only ever contains the value 1, or even only ever contains 3.
When it contains other values it is still subpar variable naming.
If something looks insane, but a probably-sane person made it and it works, then that's usually a good indicator that you are missing part of the picture and should look around. It still might be bad, but probably not insane.
+1, seen it so many times, often from some cut-and-paste-heavy refactor: one is more or less going on autopilot, just moving substrings around, and then bam. The review process hopefully catches it when it is a bug, and potentially when the semantics are unclear...
The reasonable conclusion is that you should read the surrounding context and try to understand what these variables are used for. The purpose of code review is to understand the code enough that you can spot real bugs; I try to avoid flagging surface-level stuff unless it's really important. IMO, this is a borderline case due to the very clear comment (which is a bit too far away).
We have a difference of opinion here. I lean away from bikeshedding in favor of finding more substantive issues. I'm also an old-timer, and I'd call this a 3 on a scale from 1 (easily readable code) to 10 (a suitable IOCCC entry).
It's also true that unless I had a specific mandate to touch that code, I wouldn't. If nothing else it confuses the commit history, obscuring what I was actually in there to do.
But if it were my job to touch these lines of code, I would want to leave them better than I found them.
That there are some entities (maybe ordered, maybe related to the numbers three, five and seven in some way), and there are some values associated with those entities, two of which (accidentally or not) need to be initially set to the corresponding number, but the third one possibly not; better to look around in the comments or code to understand and make sure the code is correct.
Then maybe leave a comment right after `= 1;` explaining why this variable shouldn't be initialized with 3; that would have cleared up the confusion if it had been there.
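Something like this, perhaps, assuming the reason really is the zeroth power (my guess; the kernel comment doesn't spell it out):

unsigned three = 1; /* 3^0: the power-of-3 sequence starts at group 1 */
unsigned five = 5;  /* 5^1: 5^0 == 1 is already covered by `three` */
unsigned seven = 7; /* 7^1: likewise */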
Magic numbers are a horrible pattern (and I don't mean ELF/obj magic IDs).
*ANY* time you use a non-obvious constant (like the start of a loop that isn't 0), it's good practice to comment immediately at the source, or pull it into a well-named variable/macro.
Don't agree? Spend 35 years programming and you'll see this pop up again and again and again and again... SO much time wasted figuring out magic #s.
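A made-up before/after of that advice, where the non-obvious loop start gets a name that answers the question at the call site:

#define FIRST_BACKUP_GROUP 1 /* (hypothetical reason) group 0 holds the primary copy */

static void check_group(unsigned g) { (void)g; /* ... */ }

static void scan_backups(unsigned ngroups) {
    /* was: for (unsigned i = 1; ...) -- but why 1? */
    for (unsigned i = FIRST_BACKUP_GROUP; i < ngroups; i++)
        check_group(i);
}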
This is a nicer example, since the posted one should really have different variable names (e.g. pow3). On the other hand, if this declaration of ONE is global (not contained within some context like a type), then perhaps it should be something like ONE_F8.
I don't know about the ext4 code specifically, but in hardware programming it is common to have stuff like "3 means 5". If for whatever reason a field of a register may only accept values that actually mean 1, 5 and 7 (to program whatever bits of the hardware, like clock divisors or whatnot), the hw register will probably only contain 2 bits, encoded in such a way that if you write 00b you mean 1d, if you write 01b you mean 5d, and if you write 10b you mean 7d. And 11b is reserved. This is a common thing; it's an encoding to pack a sparse set of possible values into as few bits as you can.
But then, you should probably do something such as "#define CLOCK_DIVISOR_3 0x1" instead of simply "int three = 1".
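A hypothetical sketch of that pattern: a 2-bit register field packing the legal divisors 1, 5 and 7 (all names invented):

#define CLOCK_DIVISOR_1 0x0 /* 00b means "divide by 1" */
#define CLOCK_DIVISOR_5 0x1 /* 01b means "divide by 5" */
#define CLOCK_DIVISOR_7 0x2 /* 10b means "divide by 7"; 11b is reserved */

static const unsigned clock_divisor_decode[] = { 1, 5, 7 }; /* field value -> divisor */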
It was done that way on purpose to cause hundreds of people a year to freak out and discuss it ad nauseam. But really it's a social experiment designed to show everyone is all talk and no follow-through... that no one actually tries to change open source, they just talk about it.
It's actually worse than that, given that 3, 5 and 7 all have 1 as their zeroth power, and the variables seem to encode powers of those numbers. Too much WTF and not enough comments.
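Under that reading, a rough sketch (reconstructed from the quoted kernel comment, not copied from the kernel) of how counters initialized to 1, 5 and 7 can emit the merged sequence 1, 3, 5, 7, 9, 25, 27, 49, 81, ... for the sparse case:

static unsigned list_backups_sketch(unsigned *three, unsigned *five,
                                    unsigned *seven)
{
    unsigned *min = three;
    unsigned mult = 3;
    unsigned ret;

    /* Find the smallest pending power among the three counters. */
    if (*five < *min) {
        min = five;
        mult = 5;
    }
    if (*seven < *min) {
        min = seven;
        mult = 7;
    }

    ret = *min;   /* emit it... */
    *min *= mult; /* ...and advance that counter to its next power */
    return ret;
}

So `three` would hold the next unemitted power of 3, which is why it starts at 1 (3^0) while the other two start at 5 and 7.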
To me it is a great message: if your attitude is that your predecessors were stupid morons, and you don't understand what happens here and aren't willing to bother digging deep enough to understand it, you shouldn't be making changes here. An attempt to 'correct' it is a kind of warning signal for a PR containing the correction.
Man, that hurts. When you found the issue, were you happy or just depressed? I've been there, and sometimes finding the root of an issue is not a eureka moment but more just a long, deflating sigh.
I was grumpy. Sometimes people are inexperienced and don't know how to make simple clean software and so end up with subtle bugs. I can understand that and dealing with it is part of life as an engineer. This was just laziness and there's no excuse for it.
"Write software like the next person to work on it is a psychopath and they know where you live"